Massive Data Algorithmics

In the name of Allah. Massive Data Algorithmics: An Introduction.

Overview. MADALGO; SCALGO; basic concepts; the TerraFlow project; STREAM; the TerraStream project; TPIE.

MADALGO- Introduction. MADALGO is the Center for MAssive Data ALGOrithmics, a major basic research center funded by the Danish National Research Foundation. It covers all areas of the design, analysis and implementation of algorithms and data structures for processing massive data.

MADALGO- Four core research areas: I/O-efficient algorithms. These algorithms are designed in a two-level external-memory (or I/O) model. The memory hierarchy consists of a main memory of limited size M and an external memory (disk) of unlimited size. The goal is to minimize the number of times a block of B consecutive elements is read from or written to disk (an I/O operation, or simply an I/O).
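To make the cost measure concrete, here is a minimal sketch that counts I/Os in this model. It uses the standard textbook bounds for the two-level model (scanning N elements takes ceil(N/B) I/Os; sorting takes on the order of (N/B) log_{M/B}(N/B) I/Os). The function names are hypothetical and the sorting figure ignores constant factors.

```python
import math

def scan_ios(n: int, b: int) -> int:
    """Exact number of block transfers needed to read n consecutive elements."""
    return -(-n // b)  # integer ceiling of n / b

def sort_ios_estimate(n: int, m: int, b: int) -> float:
    """Order-of-magnitude sorting cost (N/B) * log_{M/B}(N/B); constants ignored."""
    if m <= b:
        raise ValueError("main memory must hold more than one block")
    blocks = n / b
    if blocks <= 1:
        return 1.0
    # At least one pass over the data is always required.
    return blocks * max(1.0, math.log(blocks, m / b))
```

For example, with N = 2^20 elements, M = 2^10 and B = 2^5, a scan costs 32768 I/Os and the sorting estimate is about three passes over the data.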

MADALGO- Four core research areas: cache-oblivious algorithms. These algorithms are designed in the I/O model but without knowledge of M and B, and are then analyzed as I/O-model algorithms. Because the analysis does not depend on particular values of M and B, it holds simultaneously on all levels of any multi-level memory hierarchy.

MADALGO- Four core research areas: streaming algorithms. Only one sequential pass (or a small constant number of passes) over the data is allowed. The aim is to solve a given problem using significantly less space than the input data size and to process each data element as fast as possible.
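As a small illustration of the streaming constraints (one pass, constant space), the sketch below computes summary statistics of a stream of elevation values. The function name and the choice of statistics are assumptions made for the example; they are not taken from the slides.

```python
import math
from typing import Iterable, Tuple

def stream_stats(elevations: Iterable[float]) -> Tuple[int, float, float, float]:
    """Return (count, minimum, maximum, mean) using one pass and O(1) memory."""
    count = 0
    lowest, highest = math.inf, -math.inf
    mean = 0.0
    for z in elevations:
        count += 1
        lowest = min(lowest, z)
        highest = max(highest, z)
        # Incremental mean update: no need to store earlier values.
        mean += (z - mean) / count
    return count, lowest, highest, mean
```

The input can be any iterator, for instance a generator that reads elevations from disk one at a time, so the full data set never has to fit in memory. An empty stream yields a count of 0.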

MADALGO- Four core research areas: algorithm engineering. This covers the design and analysis of practical algorithms, the efficient implementation of these algorithms, and experimentation that provides insight into their applicability and into further improvements.

SCALGO. SCALGO (SCALable algorithmics) was founded in 2009 in Aarhus, Denmark. Its mission is to bring cutting-edge massive terrain data-processing technology to market.

Terrain. Terrain: the vertical and horizontal dimensions of the land surface.

LIDAR. LIDAR (Light Detection And Ranging) is an optical remote sensing technology. It measures the distance to, or other properties of, a target by illuminating the target with light, often using pulses from a laser.

Point cloud. A point cloud is a set of vertices in a three-dimensional coordinate system, usually defined by X, Y and Z coordinates. It is typically intended to be representative of the external surface of an object.

DEM. A DEM (digital elevation model) is a digital model or 3D representation of a terrain's surface. The two most used types of DEM are the regular grid and the triangulated irregular network (TIN).

Regular grid DEM. A matrix of equally spaced points, with each point having x, y and z coordinate values.
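One way to picture this representation is that only the z values need to be stored, because x and y follow from the row, the column and the grid spacing. The class below is a hypothetical sketch of that idea; its name, fields and the lower-left origin convention are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GridDEM:
    x0: float              # x coordinate of column 0 (assumed origin)
    y0: float              # y coordinate of row 0 (assumed origin)
    spacing: float         # distance between neighbouring samples
    z: List[List[float]]   # elevation matrix, z[row][col]

    def point(self, row: int, col: int) -> Tuple[float, float, float]:
        """Recover the full (x, y, z) coordinates of one grid sample."""
        return (self.x0 + col * self.spacing,
                self.y0 + row * self.spacing,
                self.z[row][col])
```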

Regular grid DEM- Quadtree. A quadtree is a tree data structure in which each internal node has exactly four children. It is most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions.
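The recursive subdivision can be sketched as follows. This is a minimal in-memory point quadtree written for illustration, not the structure used by any of the projects in these slides; the leaf capacity and the depth limit (which guards against endless splitting on duplicate points) are assumptions.

```python
class QuadTree:
    """Point quadtree over the square [x, x+size) x [y, y+size)."""

    def __init__(self, x, y, size, capacity=1, depth=0, max_depth=16):
        self.x, self.y, self.size = x, y, size
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []      # points stored here while the node is a leaf
        self.children = None  # list of exactly four children once split

    def _child_for(self, px, py):
        half = self.size / 2
        col = 1 if px >= self.x + half else 0
        row = 1 if py >= self.y + half else 0
        return self.children[2 * row + col]

    def _split(self):
        half = self.size / 2
        self.children = [
            QuadTree(self.x + col * half, self.y + row * half, half,
                     self.capacity, self.depth + 1, self.max_depth)
            for row in (0, 1) for col in (0, 1)
        ]
        old, self.points = self.points, []
        for px, py in old:  # push stored points down into the new quadrants
            self._child_for(px, py).points.append((px, py))

    def insert(self, px, py):
        """Insert a point via the root; returns False if it lies outside."""
        if not (self.x <= px < self.x + self.size
                and self.y <= py < self.y + self.size):
            return False
        node = self
        while True:
            while node.children is not None:
                node = node._child_for(px, py)
            if len(node.points) < node.capacity or node.depth >= node.max_depth:
                node.points.append((px, py))
                return True
            node._split()

    def count(self):
        if self.children is None:
            return len(self.points)
        return sum(child.count() for child in self.children)
```

Dense regions end up with deep subtrees and sparse regions with shallow ones, which is what makes the structure useful for segmenting unevenly distributed point sets.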

Triangulated Irregular Network (TIN). Irregularly distributed nodes and lines with three-dimensional coordinates, arranged in a network of non-overlapping triangles.

TIN- Delaunay triangulation. A triangulation of a set of points such that no point is inside the circumcircle of any triangle. It maximizes the minimum angle over all the angles of the triangles in the triangulation, and so tends to avoid skinny triangles.
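The empty-circumcircle property can be checked directly with the standard in-circle determinant. The sketch below is a brute-force verifier for a given triangulation, not a construction algorithm; the function names are hypothetical, and with floating-point input the strict comparison can misjudge nearly cocircular points.

```python
def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of triangle a, b, c."""
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the triangle's orientation, so normalise by it.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0

def is_delaunay(points, triangles):
    """Check every triangle (index triples) against every other point."""
    for i, j, k in triangles:
        for m, p in enumerate(points):
            if m not in (i, j, k) and in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True
```

For the four points (0, 0), (4, 0), (2, 1) and (2, -5), splitting the quadrilateral along the edge between the first two points satisfies the property, while splitting it along the other diagonal does not.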

The TerraFlow Project. TerraFlow emerged from experience with terrain analysis applications that do not scale up to large datasets. It is a software package for computing flow routing and flow accumulation on massive grid-based terrains, based on theoretically optimal algorithms designed using external-memory paradigms.

Flow direction, flow routing and flow accumulation. The flow directions of a cell are the directions in which water would flow if poured onto the terrain at that cell; water cannot go uphill. The flow routing problem is the problem of assigning flow directions to all cells in the DEM such that (1) the flow directions do not induce any cycles and (2) every cell has a flow path off the edge of the terrain. The flow accumulation of a terrain is an index that estimates the surface runoff for each cell in the terrain.
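A minimal in-memory sketch of these two steps on a small grid is given below. It assumes a single flow direction per cell (steepest descent among the eight neighbours) and one unit of water per cell, and it does not resolve flat areas or sinks, so condition (2) above is not guaranteed. It is not TerraFlow's I/O-efficient algorithm, and the function names are hypothetical.

```python
import math

NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def flow_directions(dem):
    """Map each cell to its steepest strictly lower neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are farther away, so divide by distance.
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope, direction[r][c] = slope, (nr, nc)
    return direction

def flow_accumulation(dem, direction):
    """Each cell starts with one unit and passes its total downstream."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    cells = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    # Highest cells first: every upstream cell is finished before its target.
    for _, r, c in cells:
        target = direction[r][c]
        if target is not None:
            acc[target[0]][target[1]] += acc[r][c]
    return acc
```

Because water only moves to strictly lower cells, the directions cannot form a cycle, and processing cells in decreasing elevation order is enough to propagate the totals correctly.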

STREAM- Introduction. STREAM: Scalable Techniques for hi-Resolution Elevation data Analysis and Modeling. The project is located in the CS department at Duke University and funded by the U.S. Army Research Office.

STREAM- Projects. Constructing DEMs: the project developed two methods for efficiently converting LIDAR point sets to more conventional formats. Grid construction uses a quadtree segmentation; TIN construction uses a Delaunay triangulation algorithm. Terrain flow modeling: improvements to existing work done as part of the TerraFlow project.

STREAM- Projects. Noise removal: DEMs derived from LIDAR contain some level of noise. The method computes a persistence score for topological features and uses this score to remove small topological features that are likely the result of noise.

STREAM- Projects. Hierarchical watershed decomposition: partitions a terrain into a hierarchy of nested watersheds.

STREAM- Projects. Topographic change: detecting topographic change makes it possible to quickly identify beach dunes damaged by hurricanes, monitor urban development or measure change in forest growth.

TerraSTREAM- Introduction. TerraSTREAM is a series of libraries and front-ends for these libraries. It allows the user to perform a series of computational tasks on very large digital elevation models, with the data represented either as a TIN or as a grid. It is a collaboration between Duke University CS researchers and researchers at MADALGO.

TerraStream- Features. DEM construction: computes a digital elevation model (DEM) from a point cloud. The input data is typically gathered using LIDAR. Both TINs and grids can be constructed.

TerraStream- Features. DEM topological conditioning: simplifies digital elevation models by first identifying and then removing insignificant geographical features. The significance of a feature is its height, area or volume, or any combination of these. A feature is insignificant if its significance is smaller than a threshold specified by the user.

TerraStream- Features. Flow routing: computes flow directions for each data point in a DEM. The routing models supported are steepest-flow-descent, multiple-flow-directions and flux decomposition. Flow accumulation: accumulates amounts of, e.g., water on a DEM along the flow paths computed by the flow routing module.

TerraStream- Features. Flood simulation. Flood mask: computes a mask of the cells that are flooded if the water level were raised by 'x' units. General: transforms a DEM into a new DEM in which the height of each cell is the minimum height to which the water level needs to be raised for that particular cell to flood.
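One possible reading of the two flood outputs can be sketched with a priority queue. The slides do not say where the water comes from, so this sketch assumes it enters from the edge of the grid and spreads between 4-connected cells; both choices, and the function names, are assumptions. It is an in-memory illustration, not TerraStream's implementation.

```python
import heapq

def flood_levels(dem):
    """For each cell, the lowest water level at which boundary water reaches it."""
    rows, cols = len(dem), len(dem[0])
    level = [[None] * cols for _ in range(rows)]
    heap = []
    # Assumption: water is available along the whole edge of the grid.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                level[r][c] = dem[r][c]
                heapq.heappush(heap, (dem[r][c], r, c))
    while heap:
        h, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and level[nr][nc] is None:
                # Water must clear both the path so far and the cell itself.
                level[nr][nc] = max(h, dem[nr][nc])
                heapq.heappush(heap, (level[nr][nc], nr, nc))
    return level

def flood_mask(dem, water_level):
    """Cells that are under water once the level reaches water_level."""
    return [[lvl <= water_level for lvl in row] for row in flood_levels(dem)]
```

In the grid [[5, 5, 5], [5, 1, 5], [5, 3, 5]] the centre cell has elevation 1 but only floods at level 3, because the lowest surrounding cell acts as a barrier.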

TerraStream- Features Contour Map Computation Computes the contour map of a terrain

TerraStream- Features. Raster quality assessment: takes a raster and a point cloud and computes how far the center of each raster cell is from the closest point in the point cloud. This makes it easy to spot areas of the grid where there are no points nearby. If the point cloud is the same one used to generate the input raster, the result can be used for quality control of the point cloud, the classification algorithm used and the produced raster.
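The quantity being computed can be illustrated with a brute-force sketch. It compares every cell center with every point, which is only practical for tiny inputs; the function name, the parameters and the lower-left origin convention are assumptions, and a real implementation would use a spatial index.

```python
import math

def nearest_point_distance(rows, cols, cell_size, points, x0=0.0, y0=0.0):
    """Distance from each cell center to the closest point (x, y, ...)."""
    if not points:
        raise ValueError("point cloud is empty")
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            cx = x0 + (c + 0.5) * cell_size  # center of the cell
            cy = y0 + (r + 0.5) * cell_size
            row.append(min(math.hypot(cx - p[0], cy - p[1]) for p in points))
        result.append(row)
    return result
```

Cells with a large value in the output are the ones whose elevation was interpolated from far-away samples.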

TerraStream- Features. Watershed hierarchy construction: constructs a Pfafstetter labeling of the watersheds of a DEM. LS-factor computation: the LS-factor is an aggregate of the slope length factor (L) and the slope steepness factor (S); it estimates the effects of slope length and steepness on erosion. Format flexibility: reading and writing mosaic grids in many common formats.

TPIE- Introduction. TPIE: the Templated Portable I/O Environment. It is a toolbox of efficient and convenient tools that ease the implementation of algorithms and data structures on very large sets of data. The algorithms and data structures that form the core of TPIE all provide efficient worst-case space, time and disk usage guarantees. On Windows, TPIE is known to work with the Microsoft Visual Studio 2008 and 2010 compilers.

TPIE- Example Internal sorting

TPIE- Example Reading and writing file streams

TPIE- Example External sorting
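As a library-independent illustration of the idea behind external sorting (this is plain Python and not TPIE's API), the sketch below sorts memory-sized chunks into temporary run files and then merges the runs. The function name and the `memory_limit` parameter are hypothetical; the sketch handles integers only and assumes all runs can be merged in a single pass.

```python
import heapq
import tempfile
from itertools import islice

def external_sort(values, memory_limit):
    """Yield the integers in `values` in sorted order, holding at most
    `memory_limit` of them in memory during run formation."""
    runs = []
    source = iter(values)
    while True:
        chunk = sorted(islice(source, memory_limit))  # one in-memory run
        if not chunk:
            break
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{v}\n" for v in chunk)
        run.seek(0)
        runs.append(run)

    def read(run):
        for line in run:  # sequential read of one sorted run
            yield int(line)

    try:
        # k-way merge: only the current head of each run is in memory.
        yield from heapq.merge(*(read(run) for run in runs))
    finally:
        for run in runs:
            run.close()
```

For example, `list(external_sort([5, 3, 9, 1, 7, 2, 8], 3))` forms three runs and merges them into one sorted sequence.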

TPIE- Example Priority queue

TPIE- I/O parameters M and B get_block_size() implementation

TPIE- I/O parameters Elements block size Pass the block factor to the constructor

The End. Thank you for your time.