Querying the Sensor Network. TinyDB/TAG

Similar documents
Important issues. Query the Sensor Network. Challenges. Challenges. In-network network data aggregation. Distributed In-network network Storage

Querying Sensor Networks. Sam Madden

Information Management I: Sensor Database, Querying, Publish & Subscribe, Information Summarization + SIGMOD 2003 paper

Acquisitional Query Processing in TinyDB

DSMS Part 1, 2 & 3. Data Stream Management. Example: Sensor Networks. Some Sensornet Applications. Data Stream Management Systems

TinyDB and TASK. Sensor Network in a Box SMARTER SENSORS IN SILICON 1

TAG: A TINY AGGREGATION SERVICE FOR AD-HOC SENSOR NETWORKS

Data Management in Sensor Networks

Tag a Tiny Aggregation Service for Ad-Hoc Sensor Networks. Samuel Madden, Michael Franklin, Joseph Hellerstein,Wei Hong UC Berkeley Usinex OSDI 02

Outline. TinyDB-Overview. TinyDB-Design, Code and implementation. TinyDB Overview TinyDB and Java API TinyDB Internals Extending TinyDB

Wireless Sensor networks: a data centric overview. Politecnico di Milano Joint work with: C. Bolchini F.A. Schreiber other colleagues and students

CE693: Adv. Computer Networking

TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks

Sensor Network Protocol Design and Implementation. Philip Levis UC Berkeley

TAG: Tiny AGgregate Queries in Ad-Hoc Sensor Networks

Scaling Down. Robert Grimm New York University

Query Processing & Optimization

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...

References. Introduction. Publish/Subscribe paradigm. In a wireless sensor network, a node is often interested in some information, but

Advanced Database Systems

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Introduction to Wireless Sensor Network. Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University

Querying Data with Transact SQL

Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka

SQL functions fit into two broad categories: Data definition language Data manipulation language

Chapter 3. Algorithms for Query Processing and Optimization

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System

Chapter 11: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 14: Query Optimization

Approximate Aggregation Techniques for Sensor Databases

Module 9: Selectivity Estimation

Integrated Routing and Query Processing in Wireless Sensor Networks

SQL: Queries, Programming, Triggers. Basic SQL Query. Conceptual Evaluation Strategy. Example of Conceptual Evaluation. A Note on Range Variables

Querying Data with Transact SQL Microsoft Official Curriculum (MOC 20761)

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

DBMS Query evaluation

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CSIT5300: Advanced Database Systems

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Chapter 12: Query Processing. Chapter 12: Query Processing

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

ATYPICAL RELATIONAL QUERY OPTIMIZER

What is Multicasting? Multicasting Fundamentals. Unicast Transmission. Agenda. L70 - Multicasting Fundamentals. L70 - Multicasting Fundamentals

Chapter 12: Query Processing

Database System Concepts

CS 582 Database Management Systems II

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

MANAGING DATA(BASES) USING SQL (NON-PROCEDURAL SQL, X401.9)

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems

Relational Query Optimization

Database System Concepts

Enterprise Database Systems

SQL. Chapter 5 FROM WHERE

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Querying Data with Transact-SQL

7. Query Processing and Optimization

Relational Query Optimization

Chapter 12: Indexing and Hashing. Basic Concepts

Querying Data with Transact-SQL

EECS 647: Introduction to Database Systems

Query processing and optimization

Optimized Query Processing for Wireless Sensor Networks

Chapter 12: Indexing and Hashing

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

CompSci 516 Data Intensive Computing Systems

SQL: Queries, Constraints, Triggers

Efficient Time Triggered Query Processing in Wireless Sensor Networks

Parser: SQL parse tree

Networking Acronym Smorgasbord: , DVMRP, CBT, WFQ

RCRT:Rate-Controlled Reliable Transport Protocol for Wireless Sensor Networks

Chapter 19 Query Optimization

SQL: Queries, Programming, Triggers

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Oracle Syllabus Course code-r10605 SQL

Power-Aware In-Network Query Processing for Sensor Data

TinyOS. Jan S. Rellermeyer

ECE 158A: Lecture 13. Fall 2015

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Module 4. Implementation of XQuery. Part 0: Background on relational query processing

CMSC424: Database Design. Instructor: Amol Deshpande

A New Approach of Data Aggregation in Wireless Sensor Networks

Chapter 13: Query Processing

Query Optimization Overview. COSC 404 Database System Implementation. Query Optimization. Query Processor Components The Parser

Database Applications (15-415)

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843

Temporal Aggregation and Join

Linked Stream Data Processing Part I: Basic Concepts & Modeling

Querying Microsoft SQL Server 2014

INF5100 Mandatory Exercise. Autumn 2009

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Principles of Data Management. Lecture #9 (Query Processing Overview)

GIAN Course on Distributed Network Algorithms. Spanning Tree Constructions

Multicast Communications. Slide Set were original prepared by Dr. Tatsuya Susa

GIAN Course on Distributed Network Algorithms. Spanning Tree Constructions

ORACLE TRAINING CURRICULUM. Relational Databases and Relational Database Management Systems

Transcription:

Querying the Sensor Network TinyDB/TAG 1

TAG: Tiny Aggregation Query Distribution: aggregate queries are pushed down the network to construct a spanning tree. Root broadcasts the query and specifies its level l Each node that hears message assigns its own level to be l+1 and chooses as parent a node with smallest level. Each node rebroadcasts message until all nodes have received it Resulting structure is a spanning tree rooted at the query node. Data Collection: aggregate values are routed up the tree. Internal node aggregates the partial data received from its subtree. 2

Tree-based Routing Tree-based routing Used in: Query delivery Data collection In-network aggregation Q:SELECT R:{ } R:{ } Q B Q Q A Q Q Q R:{ } C Q D Q R:{ } R:{ } Q E Q Q Q F 3

TAG example Query distribution Query collection 1 1 2 3 2 3 4 4 6 5 6 5 4

Data Model Entire sensor network as one single, infinitely-long logical table: sensors Columns consist of all the attributes defined in the network Typical attributes: Sensor readings Meta-data: node id, location, etc. Internal states: routing tree parent, timestamp, queue length, etc. Nodes return NULL for unknown attributes On server, all attributes are defined in catalog.xml Discussion: other alternative data models? 5

Query Language (TinySQL) SELECT <aggregates>, <attributes> [FROM {sensors <buffer>}] [WHERE <predicates>] [GROUP BY <attributes>] [SAMPLE PERIOD <const> ONCE] [INTO <buffer>] 6

Comparison with SQL Single table in FROM clause (exception: storage points ) Only conjunctive comparison predicates in WHERE and HAVING No subqueries No column alias in SELECT clause Arithmetic expressions limited to column op constant Only fundamental difference: SAMPLE PERIOD clause 7

TinySQL Examples Find the sensors in bright nests. 1 SELECT nodeid, nestno, light FROM sensors WHERE light > 400 EPOCH DURATION 1s Sensors Epoch Nodeid nestno Light 0 1 17 455 0 2 25 389 1 1 17 422 1 2 25 405 8

TinySQL Examples (cont.) 2 SELECT AVG(sound) FROM sensors EPOCH DURATION 10s Count the number of occupied nests in each loud region of the island. 3 SELECT region, CNT(occupied) AVG(sound) FROM sensors GROUP BY region HAVING AVG(sound) > 200 EPOCH DURATION 10s Epoch region CNT( ) AVG( ) 0 North 3 360 0 South 3 520 1 North 3 370 1 South 3 520 Regions w/ AVG(sound) > 200 9

Basic Aggregation In each epoch: Each node samples local sensors once Generates partial state record (PSR) local readings readings from children Outputs PSR during assigned comm. interval 2 1 4 3 At end of epoch, PSR for whole network output at root New result on each successive epoch 5 Extras: Predicate-based partitioning via GROUP BY 10

Interval # Illustration: Aggregation SELECT COUNT(*) FROM sensors Sensor # 1 2 3 4 5 1 Interval 4 Epoch 4 1 2 3 3 2 1 4 4 1 5 11

Interval # Illustration: Aggregation SELECT COUNT(*) FROM sensors Sensor # 1 2 3 4 5 1 Interval 3 4 1 2 3 3 2 2 1 4 2 4 5 12

Interval # Illustration: Aggregation SELECT COUNT(*) FROM sensors Sensor # 1 Interval 2 1 2 3 4 5 1 3 4 1 2 3 3 2 2 1 3 1 4 4 5 13

Interval # Illustration: Aggregation SELECT COUNT(*) FROM sensors 5 Interval 1 Sensor # 1 1 2 3 4 5 4 1 2 3 3 2 2 1 3 1 5 4 4 5 14

TAG Algorithm w/ GROUP-ing 1 Temp: 20 Light : 10 1 Group # AVG ----------------------- 1 10 2 30 3 25 Temp: 20 Light : 50 3 2 Temp: 10 Light : 10 Group # AVG ----------------------- 1 10 2 50 3 25 3 2 Group # AVG ----------------------- 1 10 5 4 Temp: 30 Light : 25 Temp: 10 Light : 15 SELECT AVG( light ) FROM Sensors GROUP BY temp/10 EPOCH DURATION... 5 4 Group # AVG ----------------------- 1 10 3 25 Group # AVG ----------------------- 1 10 6 Temp: 10 Light : 5 Group # AVG ----------------------- 1 5 6 Sensor measurements within one epoch Aggregation state progress during one epoch 15

Aggregation Framework As in extensible databases, TAG supports any aggregation function conforming to: Agg n ={f init, f merge, f evaluate } F init {a 0 } <a 0 > Partial State Record (PSR) F merge {<a 1 >,<a 2 >} <a 12 > F evaluate {<a 1 >} aggregate value Example: Average AVG init {v} <v,1> AVG merge {<S 1, C 1 >, <S 2, C 2 >} < S 1 + S 2, C 1 + C 2 > AVG evaluate {<S, C>} S/C Restriction: Merge associative, commutative 16

Considerations about aggregations Packet loss? Acknowledgement and re-transmit? Robust routing? Packets arriving out of order or in duplicates? Double count? Size of the aggregates? Message size growth? 17

Classes of aggregations Exemplary aggregates return one or more representative values from the set of all values; summary aggregates compute some properties over all values. MAX, MIN: exemplary; SUM, AVERAGE: summary. Exemplary aggregates are prone to packet loss and not amendable to sampling. Summary aggregates of random samples can be treated as a robust estimation. 18

Classes of aggregations Duplicate insensitive aggregates are unaffected by duplicate readings. Examples: MAX, MIN. Independent of routing topology. Combine with robust routing (multi-path). 19

Classes of aggregations Monotonic aggregates: when two partial records s 1 and s 2 are combined to s, either e(s) max{e(s 1 ), e(s 2 )} or e(s) min{e(s 1 ), e(s 2 )}. Examples: MAX, MIN. Certain predicates (such as HAVING) can be applied early in the network to reduce the communication cost. 20

Classes of aggregations Good worst bad Partial state of the aggregates: Distributive: the partial state is simply the aggregate for the partial data. The size is the same with the size of the final aggregate. Example: MAX, MIN, SUM Algebraic: partial records are of constant size. Example: AVERAGE. Holistic: the partial state records are proportional in size to the partial data. Example: MEDIAN. Unique: partial state is proportional to the number of distinct values. Example: COUNT DISTINCT. Content-sensitive: partial state is proportional to some (statistical) properties of the data. Example: fixed-size bucket histogram, wavelet, etc. 21

Classes of aggregates Duplicate sensitive Exemplary, Summary Monotonic Partial State MAX, MIN No E Yes Distributive COUNT, SUM Yes S Yes Distributive AVERAGE Yes S No Algebraic MEDIAN Yes E No Holistic COUNT DISTINCT No S Yes Unique HISTOGRAM Yes S No Contentsensitive 22

Use Multiple Parents Use graph structure Increase delivery probability with no communication overhead For duplicate insensitive aggregates, or Aggs expressible as sum of parts Send (part of) aggregate to all parents In just one message, via multicast Assuming independence, decreases variance SELECT COUNT(*) R P(link xmit successful) = p # of parents = n B C P(success from A->R) = p 2 E(cnt) = c * p 2 Var(cnt) = c 2 * p 2 * (1 p 2 ) V E(cnt) = n * (c/n * p 2 ) c/n Var(cnt) = n * (c/n) 2 * p 2 * (1 p 2 ) = V/n A n = 2 c/n 23

Avg. COUNT Multiple Parents Results Better than previous analysis expected! Losses aren t independent! Insight: spreads data over many links 1400 1200 1000 800 600 400 200 0 Benefit of Result Splitting (COUNT query) Splitting No Splitting (2500 nodes, lossy radio model, 6 parents per node) 24

Multiple Parents Results No Splitting With Splitting Critical Link! 25

TinyDB Architecture PC side TinyDB GUI TinyDB Client API JDBC DBMS Mote side 0 0 TinyDB query 1 2 processor 8 3 4 5 6 Sensor network 7 27 26

Multihop Networking Revised implementation of tree based routing Parent Selection: Use parent with best Quality link R:{ } B A B R:{ } B B B C Node D Neigh Qual B.75 C.66 E.45 F.82 Node C Neigh Qual A.5 B.44 D.53 F.35 R:{ } R:{ } B B E D B B B B B F B R:{ } 27

Data model revisited A single, append-only table Sensors (nodeid, time, light, temp, ) Just a conceptual view for posing queries; in reality: Data is not already there at query time Traditional database: queries independent of acquisition Here: queries drive acquisition Didn t ask for light? Then it won t be sampled! Data may not be at one place Like a distributed database, but here nodes/network are much less powerful/reliable Data won t be around forever Similar to stream data processing 28

Acquisitional Query Processing What s really new & different about databases on (mote-based) sensor networks? TinyDB s answer: Long running queries on physically embedded devices that control when and where and with what frequency data is collected Versus traditional DBMS where data is provided a priori For a distributed, embedded sensing environment, ACQP provides a framework for addressing issues of When, where, and how often data is sensed/sampled Which data is delivered 29

Acquisitional Query Processing How does the user control acquisition? Rates or lifetimes Event-based triggers How should the query be processed? Sampling as an operator, Power-optimal ordering Frequent events as joins Which nodes have relevant data? Semantic Routing Tree for effective pruning Nodes that are queried together route together Which samples should be transmitted? Pick most valuable? Adaptive transmission & sampling rates 30

Rate & Lifetime Queries Rate query SELECT nodeid, light, temp FROM sensors SAMPLE INTERVAL 1s FOR 10s Lifetime query SELECT LIFETIME 30 days SELECT LIFETIME 10 days MIN SAMPLE INTERVAL 1s A Estimate sampling rate that achieves this May not be able to transmit all the data 31

Processing Lifetimes: Issues Provide formulas for estimating power consumption: set maximum per-node sampling rates What makes this difficult? estimating the selectivity of predicates amount transmitted by a node varies widely root is a bottleneck: all nodes rates must correspond to it aggregation vs. sending individual values multiple sensing types (temp, accel) with different drain conditions change: multiple queries, burstiness, message losses What to do when can t transmit all the data 32

Storage points CREATE STORAGE POINT recentlight SIZE 8 AS (SELECT nodeid, light FROM Sensors SAMPLE PERIOD 10s); B A sliding window of recent readings, materialized locally Joining with the Sensors stream SELECT COUNT(*) FROM Sensors s, recentlight rl C WHERE rl.nodeid = s.nodeid AND s.light < rl.light SAMPLE PERIOD 10s; TinyDB only allows joining a stream with a storage point! 33

Event-based Queries ON event SELECT Run query only when interesting events happens Event examples Button pushed Message arrival Bird enters nest Analogous to triggers but events are user-defined 34

Event Based Processing ACQP want to initiate queries in response to events ON EVENT bird-detect(loc): SELECT AVG(s.light), AVG(s.temp), event.loc FROM sensors AS s WHERE dist(s.loc, event.loc) < 10m SAMPLE PERIOD 2s FOR 30s Reports the average light and temperature level at sensors near a bird nest where a bird has been detected E.g., New query instance generated for as long as bird is there 35

Event Based Processing Single external interrupt 36