Automated Netezza to Cloud Migration

Similar documents
Automated Netezza Migration to Big Data Open Source

Migrate from Netezza Workload Migration

Migrate from Netezza Workload Migration

Best practices for building a Hadoop Data Lake Solution CHARLOTTE HADOOP USER GROUP

Modern Data Warehouse The New Approach to Azure BI

Lambda Architecture for Batch and Stream Processing. October 2018

What is Gluent? The Gluent Data Platform

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Overview of Data Services and Streaming Data Solution with Azure

Making Data Integration Easy For Multiplatform Data Architectures With Diyotta 4.0. WEBINAR MAY 15 th, PM EST 10AM PST

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Modernizing Business Intelligence and Analytics

Virtuoso Infotech Pvt. Ltd.

Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.

Security and Performance advances with Oracle Big Data SQL

Big Data. Big Data Analyst. Big Data Engineer. Big Data Architect

CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED DATA PLATFORM

STATE OF MODERN APPLICATIONS IN THE CLOUD

ADABAS & NATURAL 2050+

IBM Data Replication for Big Data

Energy Management with AWS

Capture Business Opportunities from Systems of Record and Systems of Innovation

Self-driving Datacenter: Analytics

Leverage the Oracle Data Integration Platform Inside Azure and Amazon Cloud

Oracle Big Data Connectors

Cloud Storage with AWS: EFS vs EBS vs S3 AHMAD KARAWASH

What s New at AWS? A selection of some new stuff. Constantin Gonzalez, Principal Solutions Architect, Amazon Web Services

Hitachi Vantara Overview Pentaho 8.0 and 8.1 Roadmap. Pedro Alves

Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools

@Pentaho #BigDataWebSeries

Ayush Ganeriwal Senior Principal Product Manager, Oracle. Benjamin Perez-Goytia Principal Solution Architect A-Team, Oracle

Accelerate Big Data Insights

Database Migration to the Cloud C L O U D A N A L Y T I C S D I G I T A L S E C U R I T Y

Database Migration to the Cloud CLOUD ANALYTICS DIGITAL INFRASTRUCTURE SECURITY

From Single Purpose to Multi Purpose Data Lakes. Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019

The Technology of the Business Data Lake. Appendix

Building Big Data Storage Solutions (Data Lakes) for Maximum Flexibility. AWS Whitepaper

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice

Oracle GoldenGate for Big Data

WHITE PAPER HYBRID CLOUD: FLEXIBLE, SCALABLE, AND COST-EFFICIENT UK: US: HK:

Bringing Data to Life

BIG DATA COURSE CONTENT

Data 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp.

IBM DB2 Analytics Accelerator use cases

Data Lake Based Systems that Work

Hybrid Data Platform

MapR Enterprise Hadoop

Accelerate Your Data Pipeline for Data Lake, Streaming and Cloud Architectures

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Data Analytics at Logitech Snowflake + Tableau = #Winning

Ensuring business continuity with comprehensive and cost-effective disaster recovery service.

REGULATORY REPORTING FOR FINANCIAL SERVICES

BI ENVIRONMENT PLANNING GUIDE

Data 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp.

BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved.

Data Management Glossary

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

IBM Spectrum Protect Plus

TECHNOLOGY SOLUTION EVOLUTION

Simplifying your upgrade and consolidation to BW/4HANA. Pravin Gupta (Teklink International Inc.) Bhanu Gupta (Molex LLC)

EBOOK: VMware Cloud on AWS: Optimized for the Next-Generation Hybrid Cloud

WHITEPAPER. MemSQL Enterprise Feature List

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

Scalable Tools - Part I Introduction to Scalable Tools

Cloud Analytics and Business Intelligence on AWS

Hadoop An Overview. - Socrates CCDH

Government IT Modernization and the Adoption of Hybrid Cloud

What s New at AWS? looking at just a few new things for Enterprise. Philipp Behre, Enterprise Solutions Architect, Amazon Web Services

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET

Best Practices and Performance Tuning on Amazon Elastic MapReduce

IBM Db2 Event Store Simplifying and Accelerating Storage and Analysis of Fast Data. IBM Db2 Event Store

SAP BW/4HANA the next generation Data Warehouse

Amazon CloudFront AWS Service Delivery Program Consulting Partner Validation Checklist

Store, Protect, Optimize Your Healthcare Data in AWS

Informatica Big Data Management on the AWS Cloud

Asanka Padmakumara. ETL 2.0: Data Engineering with Azure Databricks

AWS Agility + Splunk Visibility = Cloud Success. Splunk App for AWS Demo. Laura Ripans, AWS Alliance Manager

SQL Server 2017 Power your entire data estate from on-premises to cloud

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

Cisco Tetration Analytics

Private Cloud Public Cloud Edge. Consistent Infrastructure & Consistent Operations

Data Architectures in Azure for Analytics & Big Data

USE CASE - HYBRID CLOUD IZO MANAGED CLOUD FOR AWS

USERS CONFERENCE Copyright 2016 OSIsoft, LLC

Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Přehled novinek v SQL Server 2016

THE JOURNEY OVERVIEW THREE PHASES TO A SUCCESSFUL MIGRATION ADOPTION ACCENTURE IS 80% IN THE CLOUD

Data Center Management and Automation Strategic Briefing

Taming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems

BUILD BETTER MICROSOFT SQL SERVER SOLUTIONS Sales Conversation Card

Stages of Data Processing

Selecting the Right Method

Security & Compliance in the AWS Cloud. Amazon Web Services

Smart Data Catalog DATASHEET

ENTERPRISE-GRADE MANAGEMENT FOR OPENSTACK WITH RED HAT CLOUDFORMS

Cloud Computing & Visualization

IBM Data Science Experience White paper. SparkR. Transforming R into a tool for big data analytics

Transcription:

Automated Netezza to Cloud Migration CASE STUDY

Client Overview Our client is a government-sponsored enterprise* that provides financial products and services to increase the availability and affordability of housing for all income groups in the U.S. Business Challenge As a leading source of financing for mortgage lenders they were expanding into secondary markets and would require a modern data lake architecture. However, they were faced with numerous business and technical challenges: *The client depicted in this case study is an actual client of Impetus and reserves the right to remain anonymous. High cost of ownership Consumption SLA were not met due to constrained EDW capacity Unbalanced positioning of workloads on Netezza Inability to process semi/ unstructured data 2 #

Approach For an end-to-end migration of workloads from Netezza to a modern Hadoop-based data lake architecture, the client needed to do the following: Assess and identify expensive/poor performing workloads and users Migrate historical data (one-time) and perform incremental updates Automatically translate Netezza queries into Spark SQL/Hive QL Balance the workloads effectively in the Hadoop space Recommend optimal storage formats for the data in the cloud Perform SCD Type 1 and Type 2 for slowly changing dimensions Refresh Netezza snapshots on AWS on a daily basis Recommend best practices for metadata, data lineage, and audits Perform a comprehensive business reconciliation for every iteration Reload data back to the EDW for reporting/analytical consumption Moved reporting/analytical consumption loads to scalable AWS vending layer 3

Solution Impetus designed a hybrid big data based cloud solution while accomplishing the following: Migrated historical data and developed a reliable mechanism for incremental updates Automatically translated consumption queries (i.e., SQL, Stored Procedures, Shell, Scheduler) into Spark SQL/Hive QL Ingested data from S3 to for reporting/analytics Replicated Netezza snapshotting process onto AWS Used Autosys and Oozie as on-premise/cloud orchestrators Assessed the existing workloads and provided recommendations to position them in Hadoop Used AWS Lambda as an eventdriven notification engine Used Amazon EMR to process vast amounts of data on cloud across scalable Amazon EC2 instances Used Amazon RDS to store metadata and Workflow info Evaluated and performed tool benchmarking for consumption queries in the cloud SOLUTION ARCHITECTURE BI/Tableau/MicroStrategy Hive, Hive LLAP S3 EMR Cloud Waterline Operational Metadata EM EM S3 Insight S3 Prep Snapshots Views SSE KMS Parquet/ ORC Hive Metastore Hive/ Spark 678" 9:8 Accelerators CEHL S3 Landing bzip2 Oozie Netezza Query Offload Business Reconciliation Lambda S3 Ingestion Framework Operational Metadata Multipart On-premise Java based data extractor EDW Failover Batch Cleanup IDS Autosys 4

Value Delivered Our comprehensive solution aligned our own best practices with our client s enterprise data lake goals. Helped the client to sunset Netezza Contained expenditures on aging technologies by freeing up capacity in Netezza Used Impetus accelerators to automagically translate Netezza logic Realized significant time and financial savings by reducing manual conversion efforts Provided on-site support and services to ensure 100% success Drove the innovation agenda by improving data availability across the enterprise To learn more about our Netezza migration solutions contact us at migration@impetus.com Impetus is focused on creating big business impact through Big Data Solutions for Fortune 1000 enterprises across multiple verticals. The company brings together a unique mix of software products, consulting services, Data Science capabilities, and technology expertise. It offers full life-cycle services for Big Data implementations and real-time streaming analytics, including. technology strategy, solution architecture, proof of concept, production implementation and on-going support to its clients. To learn more, visit www.impetus.com. 2018 Impetus Technologies, Inc. All rights reserved. Product and company names mentioned herein may be trademarks of their respective companies. 5