Managing & Analyzing High Volume of Data? It's Simpler Than You Think!


Suman Biswas, NiyamIT Inc., IT Solution Architect
Andrew Ditmore, IBM, Program Manager, Risk MAP CDS
ASFPM 2016

The Usual Challenge In Hand
o Enterprises/organizations implement solutions that collect enormous amounts of data
o Decision makers need to answer key questions accurately and in a timely fashion
o Analysts are confused:
    o Which data set to use
    o What is the source of the right data elements
    o How to ensure the data is correct
    o How to put everything together (relationships between data elements) to get to the answer

Let's Adopt A Disciplined Approach: Managing and Analyzing Data
Three Focus Areas
o DATA (Layer 1)
o ANALYTICS (Layer 2)
o RESULTS (Layer 3)
Best practices, methodologies, framework

It Begins With Data
Data Governance, Data Management, Data Quality
o VOLUME: e.g., in a Flood Disaster Study, a 1 sq. mile area could generate ~8 TB of data
o VELOCITY: Data production will be 44 times greater in 2020 than it was in 2009
o VARIETY: Structured/unstructured data (weather sensors, social media feeds, etc.)
o VERACITY: Same data in multiple places; data accuracy?
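To put the velocity figure in perspective, the short calculation below converts the quoted 44x increase between 2009 and 2020 into an implied yearly growth rate. It assumes steady compound growth over the 11 years, which is a simplification added here for illustration and is not something the slide states.

```python
# Implied compound annual growth if data production in 2020 is 44 times
# the 2009 level (the multiplier quoted on the slide).
# Assumption: growth compounds at a constant rate every year.
growth_factor = 44.0
years = 2020 - 2009  # 11 years

annual_multiplier = growth_factor ** (1 / years)
annual_growth_pct = (annual_multiplier - 1) * 100

print(f"Annual multiplier: {annual_multiplier:.2f}")
print(f"Annual growth: {annual_growth_pct:.0f}%")
```

Under that assumption the volume grows by roughly 41% per year, so storage and transfer capacity planned for today's volume is outgrown within a couple of years.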

Data: Key Aspects To Consider
o Master Data Management
    o Identify critical data elements
    o Have a single point of reference; remove duplicates
    o Standardize data and incorporate rules that keep incorrect data from entering the system, in order to create an authoritative source of master data
o Sometimes, an organization may need to acquire new data elements to answer questions effectively
o Knowledge of the domain is much more important than technical skills
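The master data steps above (standardize, apply entry rules, remove duplicates) can be sketched in a few lines. This is a minimal illustration only: the `Station` record, the field names, the `VALID_STATES` rule, and the sample rows are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    """A hypothetical master-data record."""
    station_id: str
    name: str
    state: str


# Hypothetical reference list used as an entry rule.
VALID_STATES = {"TX", "LA", "FL"}


def standardize(raw: dict) -> Station:
    """Normalize casing and whitespace so equivalent records compare equal."""
    return Station(
        station_id=raw["station_id"].strip().upper(),
        name=" ".join(raw["name"].split()).title(),
        state=raw["state"].strip().upper(),
    )


def is_valid(rec: Station) -> bool:
    """Entry rule: reject records with no ID or an unknown state code."""
    return bool(rec.station_id) and rec.state in VALID_STATES


def build_master(raw_records: list) -> dict:
    """Return one authoritative record per station_id."""
    master = {}
    for raw in raw_records:
        rec = standardize(raw)
        if not is_valid(rec):
            continue  # incorrect data never enters the master set
        master.setdefault(rec.station_id, rec)  # first record wins; duplicates dropped
    return master


raw_records = [
    {"station_id": " tx-001 ", "name": "galveston  pier", "state": "tx"},
    {"station_id": "TX-001", "name": "Galveston Pier", "state": "TX"},  # duplicate
    {"station_id": "LA-007", "name": "grand isle", "state": "la"},
    {"station_id": "ZZ-999", "name": "unknown", "state": "zz"},  # fails the state rule
]
master = build_master(raw_records)
print(sorted(master))
```

Which elements are critical and which rules apply is a domain decision, which is why the slide ranks domain knowledge above technical skills.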

The Rewarding Power Of Data
ANALYTICS
e.g., analysis indicates an impending catastrophic event
(Figure: Gartner chart)

Analytics: Key Considerations
o Adopt off-the-shelf software or develop custom queries/products
    o SAP Business Objects, IBM Cognos, etc.
o Build models that predict and optimize business outcomes
o An effective approach to building a model starts with business needs
o Arriving at an effective model may take a few iterations
o Decide when to have real-time, near real-time, or batch data
o Understand your IT infrastructure: the physical locations of your data sources
o Analysts and IT staff both play a pivotal role here
o Keep the Big Data factor in mind

Let's Take Action
RESULTS
o Reduce/Mitigate Risk
o Better Decisions
o Better Processes

Results: Key Considerations
o Adopt off-the-shelf software or develop a custom solution/dashboard
    o Tableau, Qlik, etc.
o A GIS-based solution is key within FEMA
o Keep non-IT managers and decision makers in mind as primary stakeholders
o Results/reports should be light enough for mobile devices
o Data security is key here too: not everybody should see all the information

Putting It All Together

Five Key Trends on Data Management & Analytics
o Analytics standardization is waning
    o No longer a single solution: organizations are running multiple solutions for data management and analytics (business intelligence)
o Cloud-based data warehousing is on the rise
    o Cloud-based data warehousing services show the biggest increase in adoption of any information-management category
o Real-time technology is seeing real gains
    o Apache Spark, Apache Storm, Splunk, and other options are bringing real-time analysis to system, service, network, and mobile scenarios
o Hadoop and NoSQL adoptions are growing
    o Factors driving NoSQL: faster, more flexible development than is achievable with relational databases
o Data quality concerns are easing
    o Data is messy; it always has been and always will be. Big-data platforms like Hadoop and NoSQL databases accommodate messiness

FEMA Risk MAP: A Use Case
Coastal Study Data
o The Federal Emergency Management Agency, along with a multitude of partners and communities, works in cooperation to collect the coastal data needed to produce accurate flood studies
o It is important to provide transparency to the community and enable them to evaluate the technical approach and the science behind the study
o In order to update coastal Flood Insurance Rate Maps (FIRMs), data collection is generally conducted along the entire state or county coastline or across an entire lake basin (i.e., for the Great Lakes)
o In addition to providing more accurate flood risk information, these studies make efficient use of limited tax dollars by generating new data that can be used to support other studies on coastal erosion, land use planning, estuaries/wildlife habitat, and other coastal topics
o Under the Risk MAP Program, additional "non-regulatory" products may also be included as data to help communities and property owners plan for resilience

Coastal Study Data Management: What's The Issue?
o Currently, this data is stored in various locations, and in some cases it may be difficult to obtain since it may have stayed with the partner contracted to collect it
o Another challenge is that this sort of data is typically very large, which in the past has been a prohibitive factor in creating a single coastal data storage solution
o This data must be presented in a way that is easily understood, searchable, and accessible
o The ease of searching, accessing, and obtaining the correct coastal data will save time in future efforts
o Apply GIS-based analytics to help users discover the data in a more meaningful way
o Initial data volume: 260+ TB
o Same issue for LiDAR data sets

Coastal Study Data Management: Big Data Upload/Download
o TCP-based file upload/download options have limitations
o Implemented a FASP-based solution (UDP): Aspera software
o Comparison between traditional HTTP and the Aspera solution
o A meaningful structure for the files to be stored
    o A deep folder tree structure could have > 20 sub-folders
    o Millions of files
o Validated metadata accompanies each project data upload
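A rough back-of-the-envelope calculation shows why transfer method matters at this scale. The sketch below uses the standard bound that a single TCP stream cannot send more than one window per round trip, and then estimates how long 260 TB takes at a given sustained rate. The 64 KB window (no window scaling), the 100 ms round-trip time, and the 1 Gbps link are illustrative assumptions, not figures from the presentation, and nothing here measures Aspera or FASP performance.

```python
def tcp_window_limit_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def transfer_days(volume_tb: float, rate_mbps: float) -> float:
    """Days needed to move volume_tb (decimal terabytes) at a sustained rate."""
    bits = volume_tb * 1e12 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 86_400


# Assumed values for illustration only.
limit = tcp_window_limit_mbps(window_bytes=65_535, rtt_seconds=0.1)
days_at_1gbps = transfer_days(volume_tb=260, rate_mbps=1_000)
days_at_limit = transfer_days(volume_tb=260, rate_mbps=limit)

print(f"Window-limited TCP stream: {limit:.2f} Mbps")
print(f"260 TB at a sustained 1 Gbps: {days_at_1gbps:.1f} days")
print(f"260 TB at the window limit: {days_at_limit:.0f} days")
```

Even a fully used 1 Gbps link needs more than three weeks for the initial volume, so any protocol overhead that keeps a transfer far below link speed quickly becomes impractical.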

Data Is Available In One Place: Need to Search & Retrieve
RC Libraries (ILS) Solution
Local Data, Tactical Data, Geospatial Libraries, File Servers, Legacy Systems, Image Libraries
Solution provides:
o Single system to store, index, catalog, search, discover, display, and acquire GEOINT and Multi-INT
o Simultaneous search across local and multiple repositories
o Open-source based solution: Elasticsearch
o Support for over 200 data formats, including geospatial imagery and NGA standard products
o Powerful natural language and full-text search
o Records management functionality
o User-defined automated search notifications (standing orders)
o Discoverable with Z39.50 and CSW/OGC
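As an illustration of combining full-text and geographic search in Elasticsearch, the sketch below builds a request body in the Elasticsearch query DSL using a `bool` query with a `match` clause and a `geo_bounding_box` filter. The field names `description` and `location`, the coordinates, and the helper `build_search_body` are hypothetical; the presentation does not show the actual index mapping or queries, and `location` is assumed to be mapped as a `geo_point`.

```python
import json


def build_search_body(text, top, left, bottom, right, size=10):
    """Build a query DSL body: full-text match restricted to a bounding box."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text relevance scoring on a hypothetical text field.
                "must": [{"match": {"description": text}}],
                # Non-scoring geographic restriction on a hypothetical geo_point field.
                "filter": [
                    {
                        "geo_bounding_box": {
                            "location": {
                                "top_left": {"lat": top, "lon": left},
                                "bottom_right": {"lat": bottom, "lon": right},
                            }
                        }
                    }
                ],
            }
        },
    }


body = build_search_body(
    "coastal lidar survey", top=30.5, left=-95.5, bottom=28.5, right=-93.5
)
print(json.dumps(body, indent=2))
```

The body would be sent to an index's `_search` endpoint; putting the bounding box in `filter` rather than `must` keeps it out of relevance scoring, so only the text match ranks the results.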

Risk MAP Search Engine

Risk MAP Search Engine (contd.)

Search & Retrieve Process: Technical Background
o Document-oriented NoSQL
o Partitioned metadata store
o Client uses the query function to obtain results and the use case ID to access relevant data
(Diagram: Client, Query Function, Model/query views, Existing Data Store)
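The flow described above can be sketched with an in-memory stand-in: metadata documents live in a partitioned store, a query function returns matching use case IDs, and the client then uses those IDs to fetch the bulk data from the existing data store. Everything in this sketch is hypothetical (the `MetadataStore` class, hash partitioning by ID, the field names, and the sample documents); the presentation does not describe the real schema or partitioning scheme.

```python
import zlib


class MetadataStore:
    """A toy document store that spreads metadata across partitions by ID hash."""

    def __init__(self, num_partitions=4):
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, use_case_id):
        # crc32 gives a stable hash, so a given ID always maps to the same partition.
        index = zlib.crc32(use_case_id.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put(self, use_case_id, document):
        self._partition_for(use_case_id)[use_case_id] = document

    def query(self, **criteria):
        """Scan every partition and return IDs whose metadata matches all criteria."""
        return sorted(
            use_case_id
            for partition in self.partitions
            for use_case_id, doc in partition.items()
            if all(doc.get(key) == value for key, value in criteria.items())
        )


# The existing data store keeps the bulk data, keyed by the same use case ID.
existing_data_store = {
    "UC-1": "bulk data for UC-1",
    "UC-2": "bulk data for UC-2",
    "UC-3": "bulk data for UC-3",
}

store = MetadataStore()
store.put("UC-1", {"state": "TX", "type": "lidar"})
store.put("UC-2", {"state": "LA", "type": "lidar"})
store.put("UC-3", {"state": "TX", "type": "coastal"})

# Client: query the metadata first, then use the returned IDs to reach the data.
matching_ids = store.query(state="TX")
results = [existing_data_store[use_case_id] for use_case_id in matching_ids]
print(matching_ids, results)
```

Keeping only metadata in the searchable store means the large files never have to move in order to be discoverable.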

Search & Retrieve Process: High Level View

Data Dissemination - Results
o Tableau software was chosen for the prototype, although many options exist
o Focused on:
    o Easy to read and understand data
    o Navigable by map/geography
    o Putting various measures side-by-side

Data Dissemination - Results (contd.)
o Again, selecting an alternate data set from the Table of Contents changes the data visualized but keeps the geography of interest
o The data source stays the same, but the tables and maps are updated to show data at the new level
o Much like viewing a Region showed a State breakdown, viewing a State shows a County breakdown
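The drill-down behavior described above amounts to re-aggregating one data source at the next geographic level. The sketch below shows the idea outside of Tableau; the rows, the place names, and the measure names `structures` and `policies` are made-up sample values for illustration only.

```python
from collections import defaultdict

# Geographic hierarchy, coarsest to finest.
LEVELS = ["region", "state", "county"]

# Made-up sample rows; a single source feeds every view.
rows = [
    {"region": "Region A", "state": "State 1", "county": "County X", "structures": 10, "policies": 3},
    {"region": "Region A", "state": "State 1", "county": "County Y", "structures": 20, "policies": 4},
    {"region": "Region A", "state": "State 2", "county": "County Z", "structures": 5, "policies": 1},
    {"region": "Region B", "state": "State 3", "county": "County W", "structures": 7, "policies": 2},
]


def drill_down(rows, measure, level, selected):
    """Total a measure for the selected area, broken down by the next level."""
    child = LEVELS[LEVELS.index(level) + 1]
    totals = defaultdict(int)
    for row in rows:
        if row[level] == selected:
            totals[row[child]] += row[measure]
    return dict(totals)


# Viewing a Region shows a State breakdown.
by_state = drill_down(rows, "structures", "region", "Region A")
# Viewing a State shows a County breakdown.
by_county = drill_down(rows, "structures", "state", "State 1")
# Switching the data set changes the measure but keeps the geography of interest.
policies_by_state = drill_down(rows, "policies", "region", "Region A")

print(by_state, by_county, policies_by_state)
```

Because every view is computed from the same rows, the tables and maps stay consistent with each other at each level.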

Q&A
THANK YOU!