New Trends in Database Systems

Similar documents
DS504/CS586: Big Data Analytics Data Management Prof. Yanhua Li

Data Model and Management

TrajAnalytics: A software system for visual analysis of urban trajectory data

DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li

A Novel Method for Activity Place Sensing Based on Behavior Pattern Mining Using Crowdsourcing Trajectory Data

Efficient Orienteering-Route Search over Uncertain Spatial Datasets

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Towards Linked Data and ontology development for the semantic enrichment of volunteered geo-information

Land Administration and Management: Big Data, Fast Data, Semantics, Graph Databases, Security, Collaboration, Open Source, Shareable Information

Oracle Big Data Spatial and Graph: Spatial Features Roberto Infante 11/11/2015 Latin America Geospatial Forum

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Webinar Series TMIP VISION

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

The Era of Big Spatial Data: A Survey

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

White Paper: Next generation disaster data infrastructure CODATA LODGD Task Group 2017

CrowdPath: A Framework for Next Generation Routing Services using Volunteered Geographic Information

1. Inroduction to Data Mininig

Simba: Efficient In-Memory Spatial Analytics.

Combining Government and Linked Open Data in Emergency Management

A data-driven framework for archiving and exploring social media data

Chapter 1, Introduction

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah

Database and Knowledge-Base Systems: Data Mining. Martin Ester

Visual Traffic Jam Analysis based on Trajectory Data

3 Data, Data Mining. Chengkai Li

Generating Traffic Data

Visualisation of Abstract Information

Spatiotemporal Access to Moving Objects. Hao LIU, Xu GENG 17/04/2018

Nearest Neighbor Queries

Chapter 24 NOSQL Databases and Big Data Storage Systems

Dr. John Snow s London Cholera Map (1854) Very early example of data science

Multidimensional Data and Modelling - DBMS

Design Considerations on Implementing an Indoor Moving Objects Management System

Apache Spark 2.0. Matei

Latent Space Model for Road Networks to Predict Time-Varying Traffic. Presented by: Rob Fitzgerald Spring 2017

Training courses. Course Overview Details Audience Duration. Applying GIS

SnapSite Online Quick Start Guide

BSC Smart Cities Initiative

MySQL Cluster Web Scalability, % Availability. Andrew

MobiEyes: Distributed Architecture for Location-based Services

Multidimensional (spatial) Data and Modelling (2)

Geospatial Analytics at Scale

SCALING A DISTRIBUTED SPATIAL CACHE OVERLAY. Alexander Gessler Simon Hanna Ashley Marie Smith

THINK RESILIENCY 2.0 WITH VIZONOMY

Elysium Technologies Private Limited::IEEE Final year Project

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics

Introduction to Spatial Database Systems. Outline

Oracle NoSQL Database Enterprise Edition, Version 18.1

CSE 344 Final Review. August 16 th

Advanced Data Types and New Applications

Oracle Spatial Summit 2015 Best Practices for Developing Geospatial Apps for the Cloud

Oracle Spatial A Unifying Framework at the Utah Department Of Transportation

Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:

Principles of Data Management. Lecture #14 (Spatial Data Management)

BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,

Efficient, Scalable, and Provenance-Aware Management of Linked Data

CIB Session 12th NoSQL Databases Structures

SLIPO. Scalable Linking and Integration of Big POI data. Giorgos Giannopoulos IMIS/Athena RC

IEEE 2013 JAVA PROJECTS Contact No: KNOWLEDGE AND DATA ENGINEERING

Northern India Engineering College, New Delhi Question Bank Database Management System. B. Tech. Mechanical & Automation Engineering V Semester

Nowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype?

Traces. The main idea: Use traceroutes to create a graph model of the network, use graph traversal algorithms to determine proximity of nodes

[DEMO INTRO TO CAERUS GEO] American Red Cross International Services

QUERY PROCESSING AND OPTIMIZATION FOR PICTORIAL QUERY TREES HANAN SAMET

Visualisation of Abstract Information

An Introduction to Spatial Databases

Verarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016

Spatial Analysis (Vector) II

Lecture 6: GIS Spatial Analysis. GE 118: INTRODUCTION TO GIS Engr. Meriam M. Santillan Caraga State University

Big Data Architect.

Representing Geography

Chapter 1: Introduction to Spatial Databases

Analysis of Big Data using GeoMesa

Quadrant-Based MBR-Tree Indexing Technique for Range Query Over HBase

relational Key-value Graph Object Document

CrateDB for Time Series. How CrateDB compares to specialized time series data stores

Introduction to Geodatabase and Spatial Management in ArcGIS. Craig Gillgrass Esri

Graph Data Management

Intelligent Traffic System: Road Networks with Time-Weighted Graphs

Network Analyst: Performing Network Analysis

Spatial Keyword Search. Presented by KWOK Chung Hin, WONG Kam Kwai

New Developments in Spark

Spatial data and QGIS

DS504/CS586: Big Data Analytics Data Pre-processing and Cleaning Prof. Yanhua Li

Convergence and Collaboration: Transforming Business Process and Workflows

Georeferencing. Georeferencing: = linking a layer or dataset with spatial coordinates. Registration: = lining up layers with each other

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Web GIS: Principles and Applications. Pinde Fu, Ph.D. Project Lead / Senior Developer Professional Services Division

Void main Technologies

Distributed Graph Storage. Veronika Molnár, UZH

IST 210. Introduction to Spatial Databases

A Conceptual Design Towards Semantic Geospatial Data Access

CREATING VIRTUAL SEMANTIC GRAPHS ON TOP OF BIG DATA FROM SPACE. Konstantina Bereta and Manolis Koubarakis

So, we want to perform the following query:

Instructor: Dr. Hanna A. Kirolous RFID Automated Library Management System

Taking Benefit from the User Density in Large Cities for Delivering SMS

Spatio-Temporal Routing Algorithms Panel on Space-Time Research in GIScience Intl. Conference on Geographic Information Science 2012

Location Aware Social Networks User Profiling Using Big Data Analytics

LECTURE 2 SPATIAL DATA MODELS

Transcription:

New Trends in Database Systems Ahmed Eldawy 9/29/2016 1

Spatial and Spatio-temporal data 9/29/2016 2

What is spatial data Geographical data Medical images 9/29/2016 Astronomical data Trajectories 3

Application of spatial data Tracking of infectious disease Geo-targeted advertising Geographic Information Systems (GIS) Identifying local events/groups Disaster recovery/eviction plans Routing 9/29/2016 4

Query processing Point selection Range selection Nearest neighbor Spatial join 9/29/2016 5

Indexing A better organization of records to speed up the query processing R-tree Why do we need new indexes? 9/29/2016 Quad-tree 6

Selectivity estimation Estimates the size of the answer without having to run the query Useful for query optimization e.g., range selection queries Cost model for spatial join on big data 9/29/2016 7

Road networks Road intersections as graph vertices Road segments as (weighted) edges Planar graphs Adds topology semantics to queries e.g., nearest neighbor and clustering Build and visualize a 3D road network 9/29/2016 8

Big Spatial Data Four V s of big data Volume Partitioning and distributed query processing Velocity In-memory indexes Flushing policy Variety Combine vector and raster data Veracity Inherent errors in location data 9/29/2016 9

Project ideas for BSD A benchmark for BSD systems Real data is available A mix of range and spatial join queries Run on a few systems Implement a new query on a BSD system spatial join in AsterixDB KNN join on SpatialHadoop Aggregated visualization of BSD e.g., color code cities by number of tweets Integrate AsterixDB with a visualization server 9/29/2016 10

Volunteered Geographic Information (VGI) Collection of geographic data by volunteers OpenStreetMap How to take user locations into account? Identifying experts 9/29/2016 11

Ridesharing Uber Matching drivers and passengers Tracking vehicles 9/29/2016 Figure from http://www.eia.gov/todayinenergy/detail.php?id=13531 12

Spatial keyword queries Keywords are assigned to each record Restaurant, cheap, kids Expand queries to incorporate keywords Range queries: Find all *restaurants* for *kids* within 10 miles Nearest neighbor queries: Find the nearest *gas station* that accepts *credit card* Join queries: Find *excellent* *schools* and *2br* *houses* that are within a distance of five miles 9/29/2016 13

Indoor Environment Motivated by new technologies WiFi and Bluetooth localization RFID NFC Challenges 3D localization Indoor maps Indoor routing 9/29/2016 14

Geo-image browser 9/29/2016 15

Geo-image browser 9/29/2016 16

Geo-image browser 9/29/2016 17

Emerging Applications 9/29/2016 18

Traditional Applications Relational data schema Tables, columns, and rows Normalization Relational operations Select, project, join, group and aggregate Descriptive query language SQL ACID transaction guarantees 9/29/2016 19

New Requirements Schema-less Hierarchical formats New operations Closures Similarities Scientific language R, Matlab Weak or no transaction guarantees 9/29/2016 20

Social Networks Eventual consistency High volume + High velocity 9/29/2016 21

Graph processing Social networks Web graphs Call Detail Records (CDR) Road networks Brain simulation data Knowledge base and RDF data A benchmark for big graph systems, e.g., GraphX, Giraph, and Pregelix 9/29/2016 22

Data cleaning Name Email Affiliation ZIP code Mark mark@ucr.edu UC Riverside 922521 Anthony james@uci.edu UC Riverside 92697 Name Affiliation Area code Phone Mark UCR 951 555-5555 Tony UC Irvine 949 555-5555 How many errors can you spot? Can you fix any of them? Data cleaning of big data on Spark 9/29/2016 23

Crowd sourcing Outsource *hard* problems to people e.g., Select all pictures with a waterfall Do these two pictures show the same person? Which query result is more relevant? Challenges Minimize cost Achieve a high accuracy Task allocation 9/29/2016 24

Visualization Explore big data through visualization Challenge: Make these visualizations scalable and interactive 9/29/2016 25

Privacy/Security Challenge: Keep data safe while providing scalable computing 9/29/2016 26

Summary Spatial and spatio-temporal data Query processing Big spatial data Indexing Road networks Emerging applications Social networks Graph processing Visualization Data cleaning 9/29/2016 27

Thank You! 9/29/2016 28