Real-Time & Big Data GIS: Best Practices. Josh Joyner Adam Mollenkopf

Similar documents
Real-Time & Big Data GIS: Best Practices. Suzanne Foss Josh Joyner

GeoEvent Server: An Introduction. Josh Joyner RJ Sunderman

ArcGIS GeoEvent Server Overview. Thomas Paschke

ArcGIS Enterprise: Architecture & Deployment. Anthony Myers

ArcGIS GeoEvent Server: Real-Time GIS

ArcGIS Enterprise: An Introduction. David Thom Solution Engineer State Government

GeoEvent Server: Introduction

GeoEvent Server Introduction

Real-Time & Big Data GIS: Leveraging the spatiotemporal big data store

ArcGIS Enterprise: An Introduction. Philip Heede

Data Store Management Best Practices. Bill Major Laurence Clinton

GeoEvent Server: An Introduction. Adam Ziegler, Solution Engineer

ArcGIS GeoEvent Server: Making 3D Scenes Come Alive with Real-Time Data

Monitoring Your Operations David Jacob

Working with Feature Layers. Russell Brennan Gary MacDougall

ArcGIS Enterprise: Architecting Your Deployment

ArcGIS Enterprise: Advanced Topics in Administration. Thomas Edghill & Moginraj Mohandas

Cloud Operations Using Microsoft Azure. Nikhil Shampur

ArcGIS GeoEvent Server REALTIME GIS. Jay Fowler Solution Engineer

Real-Time GIS: Applying Real-Time Analytics

What is new in ArcGIS 10.2.x for Server

Building Great Situational Awareness Apps Using ArcGIS Developer Tools. Kerry Robinson Eric Bader Thomas Solow

GeoEvent Server: Creating Connectors and Processors Using the GeoEvent SDK

Advances in GIS help create Smarter Communities

ArcGIS GeoEvent Server: Leveraging Stream Services

ArcGIS GeoEvent Server: Leveraging Stream Services. Ken Gorton RJ Sunderman

Real-Time GIS Leveraging Stream Services

ArcGIS GeoEvent Server: A Developer's Guide. Mark Bramer Esri Professional Services Vienna, VA

ArcGIS for Server: What s New. Philip Heede, Jay Theodore

VMware vcloud Air User's Guide

What s New in ArcGIS 10.3 for Server. Tom Shippee Esri Training Services

Deploying and Using ArcGIS Enterprise in the Cloud. Bill Major

Service Description. IBM DB2 on Cloud. 1. Cloud Service. 1.1 IBM DB2 on Cloud Standard Small. 1.2 IBM DB2 on Cloud Standard Medium

ArcGIS Enterprise Performance and Scalability Best Practices. Andrew Sakowicz

CIT 668: System Architecture. Amazon Web Services

Building Real Time Web Applications with GeoEvent Processor. Ken Gorton, Esri

Secure Block Storage (SBS) FAQ

IBM Terms of Use SaaS Specific Offering Terms. IBM DB2 on Cloud. 1. IBM SaaS. 2. Charge Metrics

Real-Time Data and the Internet of Things (IoT) Adam Mollenkopf Real-Time & Big Data GIS Capability Lead, Esri

JANUARY Migrating standalone ArcGIS Server to ArcGIS Enterprise

Automating ArcGIS Deployments Using Chef

Raster Analysis and Image Processing in ArcGIS Enterprise

Introduction to Database Services

Real-Time GIS: Leveraging Stream Services

Building Real-Time Web Applications Using ArcGIS GeoEvent Processor

Deploy. A step-by-step guide to successfully deploying your new app with the FileMaker Platform

VOLTDB + HP VERTICA. page

An Introduction to GIS for developers

ArcGIS Enterprise: Cloud Operations using Amazon Web Services. Mark Carlson Cherry Lin

ArcGIS GeoEvent Extension for Server: Building Real-Time Web Apps

Accelerate critical decisions and optimize network use with distributed computing

HOW TO PLAN & EXECUTE A SUCCESSFUL CLOUD MIGRATION

PI Integrator for Esri ArcGIS: A Journey Through Time and Space

ArcGIS Server Architecture Considerations. Andrew Sakowicz

ArcGIS for Server Michele Lundeen

MapR Enterprise Hadoop

ArcGIS GeoEvent Processor for Server. Jay Hagen Esri Solution Engineer

Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Service Description. IBM DB2 on Cloud. 1. Cloud Service. 1.1 IBM DB2 on Cloud Standard Small. 1.2 IBM DB2 on Cloud Standard Medium

Quick Start ArcGIS Enterprise with Automation. Shannon Kalisky Mark Carlson Nikhil Shampur Cherry Lin

Data Protection Modernization: Meeting the Challenges of a Changing IT Landscape

The Cloud's Cutting Edge: ArcGIS for Server Use Cases for Amazon Web Services. David Cordes David McGuire Jim Herries Sridhar Karra

CloudBerry Backup for Windows 5.9

Talkative Engage Mitel Architecture Guide. Version 1.0

ArcGIS for Server: Administration and Security. Amr Wahba

Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES

Administering Your ArcGIS Enterprise Portal Bill Major Craig Cleveland

Real-Time GIS: GeoEvent Extension

FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM

ArcGIS 10.3 Server on Amazon Web Services

Database Level 100. Rohit Rahi November Copyright 2018, Oracle and/or its affiliates. All rights reserved.

DocAve 5 to DocAve 6 Upgrade

Edge for All Business

Introduction to ArcGIS Server Architecture and Services. Amr Wahba

Kubernetes Integration with Virtuozzo Storage

MySQL in the Cloud Tricks and Tradeoffs

ArcGIS Enterprise Administration

Scaling DreamFactory

POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN. PostgresConf US

Elastic Compute Service. Quick Start for Windows

Modernize Your Backup and DR Using Actifio in AWS

IBM Terms of Use SaaS Specific Offering Terms. IBM DB2 on Cloud. 1. IBM SaaS. 2. Charge Metrics. 3. Charges and Billing

Data Analytics at Logitech Snowflake + Tableau = #Winning

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights

SCALE AND SECURE MOBILE / IOT MQTT TRAFFIC

Aurora, RDS, or On-Prem, Which is right for you

Creating the Fastest Possible Backups Using VMware Consolidated Backup. A Design Blueprint

ArcGIS Enterprise: Portal Administration BILL MAJOR CRAIG CLEVELAND

Software and Migration Services FAQ for more information (available from Electronic Data Solutions ). Some implementation will be required, including

Essential Features of an Integration Solution

ArcGIS for Intelligence: Discern Activities of Interest Through Advanced Analysis. Natalie Feuerstein Ben Conklin Lyle Wright

Using SQL Server on Amazon Web Services

Qlik Sense Enterprise architecture and scalability

ArcGIS GeoEvent Processor for Server. Jayson Hagen & Bryan Franey

Veeam Backup & Replication on IBM Cloud Solution Architecture

UiPath Orchestrator Azure Installation

Accelerate Big Data Insights

Gladius Video Management Software

Home of Redis. Redis for Fast Data Ingest

System Requirements. Hardware and Virtual Appliance Requirements

Transcription:

Real-Time & Big Data GIS: Best Practices Josh Joyner Adam Mollenkopf

ArcGIS Enterprise with real-time capabilities Desktop Apps APIs live features stream services live & historic aggregates & features visualization ArcGIS Enterprise ingestion actuation ArcGIS GeoEvent Server analytics spatiotemporal big data store storage

Agenda: 2 3 4 5 6 7 Architecture Recommendations Big Data Storage Performance, Resiliency, & Scalability Stream Services Service Design Considerations Upgrade Planning Troubleshooting

Architecture Recommendations

GeoEvent Server What are the primary factors I should consider? Operating environment: m5.2xlarge - virtual machines beware! resources need to be shared in an effective way, like EC2 or Azure. - dedicated bare metal machines or public cloud instances are much more deterministic. Network - speed the faster the better. GBit/s Memory - size 8GB has been required since 0.3. 32GiB, default JVM max heap size is 4 GB - type minimum of DDR3 is recommended. - clock speed (MHz) and transfer rate (Mbps) the faster the better. Processors - # of cores the more the better. 8 vcpu - speed (GHz) the faster the better. Disk - 700MB required for installation 0GB recommended minimum (new for 0.6) - amount of disk space needed will vary based on quantity of deployed input connectors - each input can utilize up to a maximum of 600 MB of disk space before clean up

spatiotemporal big data store What are the primary factors I should consider? Operating environment: m5.2xlarge - virtual machines beware! resources need to be shared in an effective way, like EC2 or Azure. - dedicated bare metal machines or public cloud instances are much more deterministic. Disk - speed the faster the better,000 Mbps EBS, note: local SSD is much better Network - speed the faster the better. GBit/s Memory - size 6GB minimum. 32GiB, big data store allocates 8GiB by default - type DDR3 is recommended. - clock speed (MHz) and transfer rate (Mbps) the faster the better. Processors - # of cores the more the better. 8 vcpu - speed (GHz) the faster the better.

ArcGIS Enterprise with real-time capabilities ArcGIS Enterprise MINIMUM environment 3 machines IoT 2 GeoEvent Server 3 spatiotemporal big data store functional servers & spatiotemporal big data store SHOULD BE on ISOLATED machines

ArcGIS Enterprise with real-time capabilities ArcGIS Enterprise OPTIMIZED environment 5 machines IoT 2 3 4 5 GeoEvent Server spatiotemporal big data store functional servers & spatiotemporal big data store SHOULD BE on ISOLATED machines

ArcGIS Enterprise with real-time & big data GIS capabilities ArcGIS Enterprise MINIMUM environment 4 machines IoT 2 3 4 Big Data GeoEvent Server spatiotemporal big data store GeoAnalytics Server

ArcGIS Enterprise with real-time & big data GIS capabilities ArcGIS Enterprise OPTIMIZED environment 0 machines IoT 2 3 4 5 6 7 8 9 0 Big Data GeoEvent Server spatiotemporal big data store GeoAnalytics Server

ArcGIS Enterprise with real-time & big data GIS capabilities on Microsoft Azure ArcGIS Enterprise Cloud Builder for Microsoft Azure ArcGIS Enterprise IoT 2 3 4 5 6 7 8 9 0 Big Data GeoEvent Server spatiotemporal big data store GeoAnalytics Server

ArcGIS Enterprise with real-time & big data GIS capabilities on Amazon EC2 ArcGIS Enterprise ArcGIS Enterprise Cloud Builder for Amazon EC2 IoT 2 3 4 5 6 7 8 9 0 Big Data GeoEvent Server spatiotemporal big data store GeoAnalytics Server

2 Big Data Storage

ArcGIS Enterprise with real-time capabilities ArcGIS Enterprise OPTIMIZED environment 5 machines IoT 2 3 4 5 GeoEvent Server spatiotemporal big data store

spatiotemporal big data store shards & replication factor spatiotemporal big data store T node T3 GeoEvent Server T2 node 5 node 4 r = node 2 T3 node 3 T T2

spatiotemporal big data store auto-rebalancing of data upon node membership changes, + or -, in the big data store spatiotemporal big data store T x T3 GeoEvent Server T2 node 5 T3 T node 4 r = T3 node 3 node 2 T T T2

spatiotemporal big data store data retention policies, configured per data source spatiotemporal big data store node r = GeoEvent Server node 5 node 2 node 4 node 3 purge based on data retention

spatiotemporal big data store rolling index option, set appropriately to the velocity of your observation data spatiotemporal big data store node GeoEvent Server node 5 indices indices r = node 2 indices node 4 indices node 3 indices purge layer based on data retention policy

spatiotemporal big data store automatic data backups using periodic snapshots, including ability to restore from a snapshot spatiotemporal big data store node r = GeoEvent Server node 5 node 2 node 4 node 3 purge layer based on data retention policy snapshot-206-05-7--0-0.snapshot snapshot-206-05-7-2-0-0.snapshot

spatiotemporal big data store choosing an Object Id option

spatiotemporal big data store choosing an Object Id option Max Value # of IDs ArcGIS Clients Int32 2,47,483,647 2. billion Pro, Desktop, Ops Dashboard, IoT rate per second per day per week per month per year 86,400 604,800 2,592,000 3,04,000 0 864,000 6,048,000 25,920,000 3,040,000 00 8,640,000 60,480,000 259,200,000 3,0,400,000,000 86,400,000 604,800,000 2,592,000,000 3,04,000,000 0,000 864,000,000 6,048,000,000 25,920,000,000 3,040,000,000 00,000 8,640,000,000 60,480,000,000 259,200,000,000 3,0,400,000,000,000,000 86,400,000,000 604,800,000,000 2,592,000,000,000 3,04,000,000,000

spatiotemporal big data store choosing an Object Id option Max Value # of IDs ArcGIS Clients Int32 2,47,483,647 2. billion Pro, Desktop, Ops Dashboard, IoT rate per second per day per week per month per year 86,400 604,800 2,592,000 3,04,000 0 864,000 6,048,000 25,920,000 3,040,000 00 8,640,000 60,480,000 259,200,000 3,0,400,000,000 86,400,000 604,800,000 2,592,000,000 3,04,000,000 0,000 864,000,000 6,048,000,000 25,920,000,000 3,040,000,000 00,000 8,640,000,000 60,480,000,000 259,200,000,000 3,0,400,000,000,000,000 86,400,000,000 604,800,000,000 2,592,000,000,000 3,04,000,000,000 25 days 2.5 days 6 hours 36 minutes

spatiotemporal big data store choosing an Object Id option Max Value # of IDs ArcGIS Clients Int32 2,47,483,647 2. billion Pro, Desktop, Ops Dashboard, Int64 (signed) 9,223,372,036,854,775,807 9.2 quintillion JavaScript, custom apps IoT rate per second per day per week per month per year 86,400 604,800 2,592,000 3,04,000 0 864,000 6,048,000 25,920,000 3,040,000 00 8,640,000 60,480,000 259,200,000 3,0,400,000,000 86,400,000 604,800,000 2,592,000,000 3,04,000,000 0,000 864,000,000 6,048,000,000 25,920,000,000 3,040,000,000 00,000 8,640,000,000 60,480,000,000 259,200,000,000 3,0,400,000,000,000,000 86,400,000,000 604,800,000,000 2,592,000,000,000 3,04,000,000,000

spatiotemporal big data store choosing an Object Id option Max Value # of IDs ArcGIS Clients Int32 2,47,483,647 2. billion Pro, Desktop, Ops Dashboard, Int64 (signed) 9,223,372,036,854,775,807 9.2 quintillion JavaScript, custom apps UniqueStringID n/a unlimited JavaScript, custom apps IoT rate per second per day per week per month per year 86,400 604,800 2,592,000 3,04,000 0 864,000 6,048,000 25,920,000 3,040,000 00 8,640,000 60,480,000 259,200,000 3,0,400,000,000 86,400,000 604,800,000 2,592,000,000 3,04,000,000 0,000 864,000,000 6,048,000,000 25,920,000,000 3,040,000,000 00,000 8,640,000,000 60,480,000,000 259,200,000,000 3,0,400,000,000,000,000 86,400,000,000 604,800,000,000 2,592,000,000,000 3,04,000,000,000

3 Performance, Resiliency, & Scalability

Performance, Resiliency & Scalability throughput benchmarks Throughput has increased with every release of ArcGIS GeoEvent Server 0.6 release can support up to 6,000 events per second (e/s) ArcGIS GeoEvent Server 0.2 0.3 0.4 0.5 0.6 Velocity throughput measured in events per second (e/s) up to 500 e/s up to 2,000 e/s up to 3,000 e/s up to 4,000 e/s up to 6,000 e/s

Performance, Resiliency & Scalability throughput benchmarks Throughput has increased with every release of ArcGIS GeoEvent Server 0.6 release can support up to 6,000 events per second (e/s) 0.6. release can support up to 0,000 events per second (e/s) ArcGIS GeoEvent Server 0.2 0.3 0.4 0.5 0.6 0.6. Velocity throughput measured in events per second (e/s) up to 500 e/s up to 2,000 e/s up to 3,000 e/s up to 4,000 e/s up to 6,000 e/s up to 0,000 e/s

Performance, Resiliency & Scalability multi-machine site support ArcGIS 0.5 Resiliency (high availability) & scalability is only possible if users bring their own gateway. Barrier to entry is HIGH & typically requires a professional services engagement for success. Loses flexibility of input types. ArcGIS 0.6 Provides users with a resilient & scalable Real-Time GIS deployment OUT-OF-THE-BOX. Introduces a gateway process that is automatically configured as part of GeoEvent Server installation. Provides flexibility for all input types. ArcGIS GeoEvent Server 0.2 0.3 0.4 0.5 0.6 0.6. Velocity throughput measured in events per second (e/s) up to 500 e/s up to 2,000 e/s up to 3,000 e/s up to 4,000 e/s up to 6,000 e/s up to 0,000 e/s Resiliency & Scalability via multi-machine site no no no no yes yes

Performance, Resiliency & Scalability multi-machine site support Available Now: http://links.esri.com/geoevent-multiplemachine

Performance, Resiliency & Scalability factors that influence throughput input filter (spatial) filter (for input) buffer output 000 e/s geofences geometry inside Zones/.* 2000 e/s 000 e/s 000 e/s 2000 e/s input2 000 e/s geotagger geofences 000 e/s output2 geometry inside Zones/.* 000 e/s filter (for input2) 2000 e/s motion calculator 000 e/s output3 000 e/s 2000 e/s 4000 e/s 3000 e/s Input event counts don t always tell the whole story

Performance, Resiliency & Scalability configuration changes to support larger scale GeoEvent Server by default is only allocated 4GB of RAM for the JVM - If utilizing a large amount of GeoFences it may be necessary to increase this amount This can be modified through the /etc/arcgisgeoevent.cfg up to 32GB (JVM limitation)

Performance, Resiliency & Scalability configuration changes to support larger scale GeoEvent Server by default is able to maintain the state of 000 unique Track_IDs. - This value can be increased by editing /etc/com.esri.ges.manager.servicemanager.cfg You may also need to modify the Incident Manager Setting in the Global Setting Tab if used in conjunction with the Incident Detector Processor

4 Stream Services

Stream Service resilience & scalability 0.5 best practice = isolated deployment of GeoEvent Server (site per GeoEvent) An isolated deployment of GeoEvent instances leads to challenges with Stream Services: - Client A & B see event, while client C & D do not - Client C & D see event 2, while client A & B do not Gateway 2 2 2 2 2 bring-your-own own message gateway broker

Stream Service resilience & scalability 0.5 best practice = isolated deployment of GeoEvent Server (site per GeoEvent) GeoEvent instances input configuration use separate consumer groups: - With this configuration, all clients see all events Gateway CG = i CG = i bring-your-own own message gateway broker

Stream Service resilience & scalability 0.5 best practice = isolated deployment of GeoEvent Server (site per GeoEvent) GeoEvent instances input configuration use separate consumer groups: - With this configuration, all clients see all events Gateway CG = i2 2 2 2 2 2 CG = i2 2 bring-your-own own message gateway broker 2 2 2

Stream Service resilience & scalability 0.6 best practice = multi-machine site of GeoEvent Servers Gateway is provided out-of-the-box at 0.6 for ingress: - all clients see all events by default site Gateway 2 2 2 Gateway 2 2 2 2 bring-your-own own message gateway broker Gateway

Stream Service transparency 0.6 best practice = multi-machine site of GeoEvent Servers A reverse proxy can be configured in between the clients and the stream services so that clients don t have direct knowledge of the servers they are connecting to. - Example reverse proxies include NGiNX & Microsoft Application Request Routing (ARR). site Gateway Gateway or ARR bring-your-own own message gateway broker Gateway bring-your own reverse proxy

5 Service Design Considerations

Service Design Considerations which would you choose? Service A Service B

Service Design Considerations which would you choose? Each branch in a service contains the same event data. In this example, with three branches, it is creating 3X the volume of data. When possible, pre-filter the input data before ingesting.

Service Design Considerations not all components are created equally A Which of these services will process the fastest? Slowest? B C

Service Design Considerations not all components are created equally B C A The first service only contains components that are utilizing the internal service cache, which allows for the fastest processing.

Service Design Considerations not all components are created equally B C A The second service modifies the incoming event geometry which can be costly. These types of requests are typically very quick but can be impacted by geometry complexity.

Service Design Considerations not all components are created equally B C A The third service utilizes Network Analyst to return a drive time polygon which can significantly impact throughput.

Service Design Considerations other recommendations Configure Filters and/or Field Reducer Processors as early as possible in a service - This reduces the volume / data size of the events being processed - Potentially simplifies service configuration down stream Avoid Managed GeoEvent Definitions when possible - These are system owned definitions whose lifecycle is entirely controlled by the processors - Editing or Deleting a processor will remove these definitions - If necessary copy generated definition and edit processor to look for it Utilize the combination of Imported Definitions and Field Mapper Processor for Feature Service Outputs - This ensures that all of the event data is being written in the correct format - Can also be used to update only a portion of the fields

6 Upgrade Planning

Upgrade Planning what should be consider In-place Upgrade vs Clean Installation - When possible do a clean install - GeoEvent Server install and uninstalls very quickly Export Configuration & Global Settings from within GeoEvent Manager - Use time to remove any unused definitions or components Backup any configuration files that were modified in /etc folder Copy contents of /deploy folder (custom components) Delete contents of old site configuration (e.g. C:\arcgisserver\local\zookeeper) Install new version and import configurations

7 Troubleshooting

Troubleshooting inputs I can t get my data to come in Check the definition - Most input problems are with misconfigured schema (field names, data types, group structure) - Try letting GeoEvent Server create definition for you - but you will likely need to edit the definition and edit the input to use the one you modified

Troubleshooting outputs I can t write my data to Check the definition - Most output problems are with misconfigured schema (field names, data types, group structure) - If possible import the definition from the service you are trying to write to - If an Esri Feature Service remove the reserved field names (e.g. ObjectID / OID) - Use the GeoEvent Logger application to verify the expected output data

Troubleshooting backup Everything was working yesterday but or Someone accidently deleted Did you make a backup of your configuration? - With ArcGIS GeoEvent Server 0.5 or newer, we did for you

Help us improve the Real-Time & Big Data GIS Capabilities http://esriurl.com/realtimesurvey

Questions / Feedback? Adam Mollenkopf Real-Time & Big Data GIS Capability Lead amollenkopf@esri.com @amollenkopf Josh Joyner ArcGIS GeoEvent Server Product Manager jjoyner@esri.com