Approximately Uniform Random Sampling in Sensor Networks

Similar documents
Approximately Uniform Random Sampling in Sensor Networks

Challenges in Geographic Routing: Sparse Networks, Obstacles, and Traffic Provisioning

Geographic Routing in Simulation: GPSR

Approximate Aggregation Techniques for Sensor Databases

Fractional Cascading in Wireless. Jie Gao Computer Science Department Stony Brook University

Geographic Routing Without Location Information. AP, Sylvia, Ion, Scott and Christos

Summarizing and mining inverse distributions on data streams via dynamic inverse sampling

Outline. Motivation. Traditional Database Systems. A Distributed Indexing Scheme for Multi-dimensional Range Queries in Sensor Networks

Data-Centric Query in Sensor Networks

KD-Tree Algorithm for Propensity Score Matching PhD Qualifying Exam Defense

An efficient implementation of the greedy forwarding strategy

Routing in Sensor Networks

Sensor Tasking and Control

Geographic and Diversity Routing in Mesh Networks

Problem 1: Complexity of Update Rules for Logistic Regression

TrajStore: an Adaptive Storage System for Very Large Trajectory Data Sets

BGR: Blind Geographic Routing for Sensor Networks

Simulations of the quadrilateral-based localization

PROJECT REPORT VIRTUAL COORDINATE GENERATOR AND ROUTING SIMULATION TOOL FOR WIRELESS SENSOR NETWORKS IN 2 AND 3-DIMENSIONAL NETWORK SPACE

GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC

Motivation and Basics Flat networks Hierarchy by dominating sets Hierarchy by clustering Adaptive node activity. Topology Control

Networking Sensors, II

PANEL: Position-based Aggregator Node Election in Wireless Sensor Networks

Geometric data structures:

One-Pass Streaming Algorithms

Localized and Incremental Monitoring of Reverse Nearest Neighbor Queries in Wireless Sensor Networks 1

Fosca Giannotti et al,.

The Capacity of Wireless Networks

Choosing a Random Peer

Introduction to Indexing R-trees. Hong Kong University of Science and Technology

Query Evaluation in Wireless Sensor Networks

Ad hoc and Sensor Networks Topology control

A Location-based Directional Route Discovery (LDRD) Protocol in Mobile Ad-hoc Networks

EFFICIENT DATA TRANSMISSION AND SECURE COMMUNICATION IN VANETS USING NODE-PRIORITY AND CERTIFICATE REVOCATION MECHANISM

Processing Rank-Aware Queries in P2P Systems

AVMEM: Availability-Aware Overlays for Management Operations in Non-cooperative Distributed Systems

Edexcel Linear GCSE Higher Checklist

On Biased Reservoir Sampling in the Presence of Stream Evolution

Constructing Popular Routes from Uncertain Trajectories

Distributed Voronoi diagram computation in wireless sensor networks

Applications. Oversampled 3D scan data. ~150k triangles ~80k triangles

Continuous. Covering. Location. Problems. Louis Luangkesorn. Introduction. Continuous Covering. Full Covering. Preliminaries.

Important issues. Query the Sensor Network. Challenges. Challenges. In-network network data aggregation. Distributed In-network network Storage

A Computational Theory of Clustering

Computing Aggregate Functions in Sensor Networks

Mining Data Streams. Outline [Garofalakis, Gehrke & Rastogi 2002] Introduction. Summarization Methods. Clustering Data Streams

Motion Planning. Howie CHoset

Geographical routing 1

Probabilistic Double-Distance Algorithm of Search after Static or Moving Target by Autonomous Mobile Agent

Exploratory Data Analysis

Classifiers and Detection. D.A. Forsyth

MLS An Efficient Location Service for Mobile Ad Hoc Networks Roland Flury Roger Wattenhofer

Highway Dimension and Provably Efficient Shortest Paths Algorithms

DS504/CS586: Big Data Analytics Big Data Clustering II

SCALING A DISTRIBUTED SPATIAL CACHE OVERLAY. Alexander Gessler Simon Hanna Ashley Marie Smith

Energy-Efficient Communication Protocol for Wireless Micro-sensor Networks

Geometric Routing: Of Theory and Practice

Topology Control in 3-Dimensional Networks & Algorithms for Multi-Channel Aggregated Co

Clustering Algorithm (DBSCAN) VISHAL BHARTI Computer Science Dept. GC, CUNY

Clustering in Data Mining

Codes, Bloom Filters, and Overlay Networks. Michael Mitzenmacher

A Scalable Content- Addressable Network

Locality- Sensitive Hashing Random Projections for NN Search

[Kleinberg04] J. Kleinberg, A. Slivkins, T. Wexler. Triangulation and Embedding using Small Sets of Beacons. Proc. 45th IEEE Symposium on Foundations

Beyond A*: Speeding up pathfinding through hierarchical abstraction. Daniel Harabor NICTA & The Australian National University

An Energy-aware Greedy Perimeter Stateless Routing Protocol for Mobile Ad hoc Networks

kd-trees Idea: Each level of the tree compares against 1 dimension. Let s us have only two children at each node (instead of 2 d )

Otfried Cheong. Alon Efrat. Sariel Har-Peled. TU Eindhoven, Netherlands. University of Arizona UIUC

Motivation. The Impact of DHT Routing Geometry on Resilience and Proximity. Different components of analysis. Approach:Component-based analysis

Information-Driven Dynamic Sensor Collaboration for Tracking Applications

Information Brokerage

The Impact of DHT Routing Geometry on Resilience and Proximity. Acknowledgement. Motivation

Distributed Passive Routing Decisions in Mobile Ad-Hoc Networks

OPTICAL NETWORKS. Virtual Topology Design. A. Gençata İTÜ, Dept. Computer Engineering 2005

Landmark-based routing

Accelerating Geometric Queries. Computer Graphics CMU /15-662, Fall 2016

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS

COMP 465: Data Mining Still More on Clustering

Hyperbolic Traffic Load Centrality for Large-Scale Complex Communications Networks

Local Algorithms for Sparse Spanning Graphs

Multidimensional Indexes [14]

On low-latency-capable topologies, and their impact on the design of intra-domain routing

26 The closest pair problem

SMITE: A Stochastic Compressive Data Collection. Sensor Networks

Can work in a group of at most 3 students.! Can work individually! If you work in a group of 2 or 3 students,!

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K.

Outline. Introduction. Outline. Introduction (Cont.) Introduction (Cont.)

Wireless Network Capacity. Nitin Vaidya

Nonparametric Importance Sampling for Big Data

Mutually Exclusive Data Dissemination in the Mobile Publish/Subscribe System

Geostatistics 2D GMS 7.0 TUTORIALS. 1 Introduction. 1.1 Contents

Mesh Decimation. Mark Pauly

Algorithms for GIS:! Quadtrees

Estimation of Wirelength

GLIDER: Gradient Landmark-Based Distributed Routing for Sensor Networks. Stanford University. HP Labs

Nearest Neighbor with KD Trees

Extracting Rankings for Spatial Keyword Queries from GPS Data

Collision and Proximity Queries

CS 4649/7649 Robot Intelligence: Planning

Algorithms for Euclidean TSP

Transcription:

Approximately Uniform Random Sampling in Sensor Networks Boulat A. Bash, John W. Byers and Jeffrey Considine

Motivation Data aggregation Approximations to COUNT, SUM, AVG, MEDIAN Existing work does not use sampling TAG (Madden et al. 2002) State of the art: FM sketches (Considine et al. 2004) Randomized algorithms e.g. randomized routing

Introduction What is this talk about? Selecting (sampling) a random node in a sensornet Why is sampling hard in sensor networks? Unreliable and resource-constrained nodes Hostile environments High inter-node communication costs How do we measure costs? Total number of fixed-size messages sent per query

Outline Exact uniform random sampling Previous work Approximately uniform random sampling Naïve biased solution Our almost-unbiased algorithm Experimental validation Heuristics for improving samples Preliminary simulations Conclusions and future work

Sampling Problem Exact uniform random sampling Each sensor s is returned from network of n reachable sensors with probability 1/n Existing solution (Nath and Gibbons, 2003) Each sensor s generates (r s, ID s ) where r s is a random number Network returns ID of the sensor with minimal r s Cost: Ө(n) transmissions

Relaxed Sampling Problem (ε, δ)-sampling Each sensor s is returned with probability no greater than (1+ε)/n, and at least (1-δ) n sensors are output with probability at least 1/n

Naïve Solution Spatial Sampling Return the sensor closest to a random (x,y) Yes!!! n=10 Possible with geographic routing (GPSR 2001) Nodes know own coordinates (GPS, virtual coords, pre-loading) Fully distributed; state limited to neighbors locations Cost: Ө(D) transmissions, D is network diameter

Pitfall in Spatial Sampling Bias towards large Voronoi cells Definition: Set of points closer to sensor s than any other sensor (Descartes, 1644) Areas known to vary widely Voronoi diagram n=10

Removing Bias Rejection method Ugh Yes!!! n=10 In each cell, mark area of smallest Voronoi cell Only accept probes that land in marked regions In practice, use Bernoulli trial for acceptance with P[accept] =A min /A s (von Neumann, 1951) Find own cell area A s using neighbor locations Need c = A avg /A min probes per sample on average

Rejection-based Sampling Problem: Minimum cell area may be small Solution: Under-sample some nodes Let A threshold A min be globally-known cell area 1. Route probe to sensor s closest to random (x,y) 2. If A s < A threshold, then sensor s accepts Else, sensor accepts with Pr[acc] = A threshold /A s A threshold set by user For (ε, δ)-sampling, set to the area of the cell that is the k-quantile, where k = min δ,ε (1 + ε) Cost: Ө(cD) transmissions ( )

James Reserve Sensornet

James Reserve Sensornet 52 sensors E[#probes] ε δ 1.0 (naïve) 4.3 0.69 1.5 0.48 0.46 2.2 0.12 0.23 3.1 0.041 0.15 4.1 0.012 0.038 5.0 0.0072 0.019

Random topology 2 15 sensors randomly placed on a unit square E[#probes] ε δ 1.0 (naïve) 3.8 0.57 1.3 0.27 0.41 2.1 0.051 0.15 3.1 0.017 0.06 4.0 0.0079 0.029 5.0 0.0042 0.017

Improving Algorithm Put some nodes with small cells to sleep No sampling possible from sleeping nodes Similar to power-saving schemes (Ye et al. 2002) Virtual Coordinates Node locations assigned using local connectivity information (Rao et al. 2003) Hard lower bound on inter-sensor distances

Improving Algorithm Pointers Yes!!! Ok n=10 Large cells donate their unused area to nearby small cells When a large cell rejects, it can probabilistically forward the probe to one of its small neighbors

Simulation Objectives Avoid the following behavior Show feasibility of implementation Demonstrate scalability Contrast with previous work

Simulation Setup Existing simulation tools inadequate Lack of scalability to tens of thousands of nodes No middle-ground proof of concept tool SGNS Simple Geographic Network Simulator Implements GPSR Counts messages only, similar to TAG Simulator Simulate on 3 topologies James Reserve (JR) 52 nodes Random placement 2 15 nodes Synthetic JR-based 2 15 nodes

Synthesizing Topologies Trivial random placement is not adequate Humans do not behave randomly! First-principles approach to sensor placement Inspired by Li et al. 2004 1. Nodes have two mutually non-exclusive tasks: sensing and routing Areas of higher node densities 2. Humans are rational in sensor placements Minimum inter-nodal distance 3. Sensed environment is not predictable Non-uniformity in placement

First Principles in Action Voronoi cell size distributions: JR vs. random 1 2 1. Proportionally more nodes with below-average area in JR 2. Smallest cells in random topology much smaller: sensors are too close

Topology Generator STG Synthetic Topology Generator n=10, r=0.3 D=5 md=0.1 D=4 md=0.4 D=3.6 md=0.2 Inputs Number of nodes and transmission radius Set of non-overlapping axis-aligned rectangles Relative node density Minimum inter-nodal distance Connectivity requirement Iteratively place nodes on the rectangles at random

Inflating JR topology 2 15 -node version of 52-node JR topology 15 Rectangles: JR topology split into 3 5 grid Node density maintained by increasing area by 2 15 /52 No zero-node rectangles Minimum inter-nodal distance (MID) set to minimal distance between any two nodes in rectangle One-node rectangles: use probabilistically maximal MID < r Require connectivity Problem Scaling assumption

Synthetic JR-based topology Voronoi cell size distribution Looks better than random!

Simulation Results JR random synthetic n 52 32,768 32,768 Dimensions (sq ft) 2,127x1,306 53,341x32,660 53,341x32,660 Transmission radius (ft) 500 500 500 Number of samples 80 Million 80 Million 80 Million Expected number of probes 5 5 5 Total transmissions 8.74 Billion 42.38 Billion 25.24 Billion Transmissions per sample 109 530 315 Greedy mode 14.190 % 67.091 % 64.875 % Perimeter mode 1.331 % 3.746 % 0.008 % Closest node determination 84.480 % 29.164 % 35.117 % Transmissions per sample (excluding closest node determination) 17 375 205 N&G estimated trans. per sample 78 49,152 49,152 Large fraction of transmissions is due to closest node determination

Simulation Conclusions To do list Test sampling for data aggregation Compare costs with TAG, FM sketches Try inflating Voronoi cells of existing small topologies to obtain large synthetic topologies instead of rectangles Simulate pointers improvement Wish list Node/link loss models for SGNS Node mobility in SGNS Massive TOSSIM simulation

Conclusions New nearly-uniform random sampling algorithm Cost proportional to sending a point-to-point message Tunable (and generally small) sampling bias Proof-of-concept simulations show viability Future Work Extend to non-geographic predicates Reduce messaging costs for high number of probes Move beyond request/reply paradigm Apply to DHTs like Chord (King and Saia, 2004)

Backup slides von Neumann s Rejection Method Geographic Routing

Rejection Method von Neumann s rejection method (1951) Problem: impossible to sample from PDF f Idea: Sample indirectly from g and scale to f Solution: Draw sample t from g, but accept f (t) with probability c g (t) where c is a positive constant t }Problem! Reject Reject Accept Accept t t

Geographic Routing GPSR Greedy Perimeter Stateless Routing (Brad Karp and H.T. Kung, 2000) State limited to neighbor location information Greedy mode default protocol state Forward to neighbor closest to the destination

Geographic Routing (cont.) Voids where greedy forwarding fails Greedy mode Perimeter mode No nodes available in transmission range closer to destination then self Perimeter mode circumnavigate voids using the right-hand rule

Geographic Routing (cont.) Closest node determination Greedy mode Perimeter mode Visit every node on a perimeter enclosing probe destination Start and finish at the closest node