HANA & Hadoop SAP FORUM. Javier Fernandez Leon February 2016


1 Rumbo 2020 SAP FORUM HANA & Hadoop Javier Fernandez Leon February 2016 FTS INTERNAL

2 HANA & Hadoop: Contents
- Intro
- Challenges of distributed Big Data
- What is Apache Hadoop? Features
- Comparison: HANA vs Hadoop
- HANA & Apache Spark
- HANA & Hadoop combined: scenarios
- Use cases
- HANA & Hadoop managed service
- Pay-per-use model for HANA & Hadoop
Copyright 2014 LIMITED

3 Challenges of distributed Big Data
WE ARE DROWNING IN OUR OWN DATA
- Inefficient data processing: real-time drill-down interaction is impossible when data is distributed across thousands of nodes and processed in batches.
- Lack of business alignment: business decisions need to follow changing external market conditions, which means processing data in business systems together with Hadoop data lakes.
- Costly management of Big Data: large amounts of data clog business systems when that data could be archived more efficiently to less expensive systems.

4 Gap between the Enterprise & Big Data Frameworks
- Enterprise core systems and Big Data frameworks & tools are unable to work together (diagram labels: Complexity, Performance).
- Objectives: standardize, simplify and automate both worlds.

5 What is Apache Hadoop?
Apache Hadoop is open source software that enables reliable, scalable, distributed computing on clusters of inexpensive servers.
- RELIABLE: the software is fault tolerant; it expects and handles hardware and software failures.
- SCALABLE: designed for massive scale of processors, memory and locally attached storage, up to petabytes.
- DISTRIBUTED: handles replication and offers a massively parallel programming model, MapReduce.
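To make the MapReduce programming model named on this slide more concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real cluster would run the map and reduce steps in parallel on many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key, as a framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values of one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence (hypothetical driver, single process)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

Because each map call sees only one line and each reduce call sees only one key, both steps can be spread across servers without coordination, which is what makes the model massively parallel.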

6 Hadoop Logical Components (diagram)

7 What does Hadoop bring to the table?
Cost-efficient data storage and processing for large volumes of structured, semi-structured and unstructured data such as web logs, machine data, text data, call data records, audio and video data.
- BATCH PROCESSING: where fast response times are less critical than reliability and scalability.
- COMPLEX INFORMATION PROCESSING: enables heavily recursive algorithms, machine learning and queries that cannot be easily expressed in SQL.
- LOW-VALUE DATA ARCHIVE: data stays available, though access is slower. Scales up to petabytes.
- POST-HOC ANALYSIS: mine raw data that is either schema-less or whose schema changes over time.

8 Who uses Hadoop?
- FACEBOOK: Facebook runs the world's largest Hadoop cluster. Just one of several Hadoop clusters operated by the company spans more than 4,000 machines and houses over 100 petabytes of data.
- YAHOO: Yahoo runs Hadoop on 42,000 servers (1,200 racks) in four data centers. Its largest Hadoop cluster was 4,000 nodes.
- TWITTER: Twitter uses Hadoop for product analysis, social graph analysis, generating indices for people search, natural language processing and many other applications.
Further uses listed on the slide: Facebook messaging (HBase) and report generation for advertisers who need to track the effectiveness of campaigns; indexing of web crawl results.

9 Comparison: Hadoop & SAP HANA
Criterion | Hadoop | SAP HANA
Data architecture | Unstructured data and files on disk | Structured data in memory
Data structures | No predefined schema | Predefined schema & models
Performance | Very slow data access (seconds to hours) | Very fast access (~<1 ms)
Scalability | Scale-out to thousands of low-cost servers | Scale-up / scale-out to many servers
Data consistency | BASE (basic availability, soft state, eventual consistency) | ACID (atomicity, consistency, isolation, durability)
Licensing costs | Free open source or commercial distros | Many options: cloud, enterprise
OLTP | No OLTP | Excellent OLTP
OLAP | Slow OLAP | Excellent OLAP
Server failover | Query & server failover | Server failover
Enterprise admin tools | Small | Excellent

10 Combination of HANA & Hadoop
- SAP HANA = instant results
- Hadoop = infinite storage + raw data
- SAP & Hadoop = instant access + infinite scale

11 Connection to HANA: Smart Data Access (SDA)
Benefits
- Enables access to remote data just like a local table
- Smart query processing, including query decomposition with predicate push-down and functional compensation
- Supports data-location-agnostic development
- No special syntax to access heterogeneous data sources
- Not restricted to Hadoop
Heterogeneous data sources
- Oracle, MS SQL, Teradata, DB2, Netezza
- Hadoop: Hive, vUDF, Spark
- SAP HANA (BWoH, SoH)
- SAP Sybase ASE, IQ, MaxDB
- SAP Sybase ESP, SQLA
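The benefit of predicate push-down can be illustrated with a small Python sketch. This is not SDA or HANA code: `RemoteSource`, `query_without_pushdown` and `query_with_pushdown` are hypothetical names, and the "network" is simulated by a counter of rows returned by the remote side.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class RemoteSource:
    """Hypothetical stand-in for a remote system, for example a Hive table."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self.rows_shipped = 0  # rows sent back to the caller over the "network"

    def scan(self, predicate: Callable[[Row], bool] = lambda row: True) -> List[Row]:
        """Return the rows matching the predicate; the filter runs on the remote side."""
        result = [row for row in self._rows if predicate(row)]
        self.rows_shipped += len(result)
        return result


def query_without_pushdown(source: RemoteSource, predicate: Callable[[Row], bool]) -> List[Row]:
    """Fetch every remote row, then filter locally."""
    return [row for row in source.scan() if predicate(row)]


def query_with_pushdown(source: RemoteSource, predicate: Callable[[Row], bool]) -> List[Row]:
    """Hand the filter to the remote side so only matching rows are shipped."""
    return source.scan(predicate)
```

Both functions return the same rows; the difference is how much data crosses the boundary between systems, which is where a federated query usually spends its time.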

12 Example scenario bringing both worlds together: POS scenario, Hadoop and HANA (diagram)

13 Apache Spark
- Very fast in-memory data-processing framework, claimed to be up to 100x faster than Hadoop MapReduce
- Unlike Hadoop, supports both batch and streaming analysis: a single framework for batch and near-real-time use cases
- Spark requires:
  1) Cluster management: standalone, Hadoop YARN, Apache.
  2) Distributed storage system: supports HDFS, Cassandra, OpenStack Swift, Amazon S3
- All Hadoop connectors can be leveraged in Spark
- If you are going to start with Hadoop now, you should do it with Spark

14 SAP HANA Vora: what is inside?
HANA Vora is an in-memory query engine that leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop.
- HANA Spark adapter for improved performance between distributed systems
- Compiled queries enable applications and data analysis to work more efficiently across nodes
- Familiar OLAP experience on Hadoop to derive business insights from Big Data, such as drill-down into HDFS data
- Integration of SAP data with data lakes
- HANA connectivity on Hadoop
- Enterprise analytics (hierarchies) and interactive SQL on Hadoop data
- Data tiering from HANA to Hadoop for OLAP scenarios using DLM
- Archiving of ERP data to Hadoop using ILM

15 SAP HANA Vora use case: IoT for a turbine
- Sensors stream data continuously
- Sensors are typically structured in a hierarchy
- Information about the hierarchy is typically stored in the ERP system
- Information important for error detection: two sensors
Role of HANA Vora
- Provides OLAP capabilities: joining the hierarchy with IoT data
- Bridges the gap between enterprise systems and the cluster: the BOM of the turbine is easily accessible
- Performance of in-memory computing on both enterprise and cluster processing
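Joining a hierarchy with IoT data, as this use case describes, can be sketched in plain Python. The hierarchy, the sensor names and the readings below are invented for illustration and are not taken from the presentation; Vora itself would express this as a query rather than application code.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical bill of material as it might be held in an ERP system: child -> parent.
HIERARCHY: Dict[str, Optional[str]] = {
    "turbine-1": None,
    "rotor": "turbine-1",
    "gearbox": "turbine-1",
    "sensor-a": "rotor",
    "sensor-b": "rotor",
    "sensor-c": "gearbox",
}


def ancestors(node: str, hierarchy: Dict[str, Optional[str]]) -> List[str]:
    """Walk from a node up to the root and return every parent on the way."""
    path: List[str] = []
    parent = hierarchy.get(node)
    while parent is not None:
        path.append(parent)
        parent = hierarchy.get(parent)
    return path


def roll_up_peaks(
    readings: List[Tuple[str, float]], hierarchy: Dict[str, Optional[str]]
) -> Dict[str, float]:
    """Join readings with the hierarchy and keep the peak value per node."""
    peaks: Dict[str, float] = {}
    for sensor, value in readings:
        for node in [sensor] + ancestors(sensor, hierarchy):
            peaks[node] = max(value, peaks.get(node, value))
    return peaks
```

With the peak per component available, a drill-down from the turbine to the rotor to an individual sensor becomes a series of dictionary lookups, which is the kind of OLAP navigation the slide refers to.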

16 Key Scenarios

17 Key Scenarios: examples
- Flexible data store: using Hadoop as a flexible store of data captured from multiple sources, including SAP and non-SAP software, enterprise software, and externally sourced data
- Simple database: using Hadoop as a simple database for storing and retrieving data in very large data sets
- Processing engine: using the computation engine in Hadoop to execute business logic or some other process
- Data analytics: mining data held in Hadoop for business intelligence and analytics

18 Key Scenarios: Architecture (diagram)

19 Key Scenarios: Hadoop as Flexible Data Store
Social Media
  Description: Real-time capture of data from social media sites, especially of unstructured text
  Sample use cases: Comments on products on Twitter, Facebook, and Amazon
  Comment: Combine social media data with other data, for example CRM data or product data, in real time to gain insight
Data Stream Capture
  Description: Real-time capture of high-volume, rapidly arriving data streams
  Sample use cases: Smart meters, factory floor machines, real-time web logs, sensors in vehicles
Data Archive
  Description: Capture of archive logs that would otherwise be sent to off-line storage
  Sample use cases: Archive data or computer system logs
OLTP Transaction Data
  Description: Long-term persistence of transactional data from historical online transaction processing (OLTP)
  Sample use cases: Call center, inventory
Further comment on the slide: Lower costs when compared with conventional solutions

20 Key Scenarios: Hadoop as Flexible Data Store
Reference Data
  Description: Copy of existing large reference data sets
  Sample use cases: Census surveys, GIS, large industry-specific data sets, weather measurement and tracking systems
  Comment: Store reference data alongside other data in one place to make it easier to combine for analytic purposes
Histories
  Description: Capture logs of correspondence a company sends and receives
  Sample use cases: Fulfillment of legal requirements for persistence and for use in analytics
  Comment: Combine this data with other data to support, for example, risk management
Document & Multimedia Storage
  Description: Capture of business documents generated and received by the business (BLOBs)
  Sample use cases: Healthcare, insurance and other businesses that generate or use large volumes of documents that must be kept for extended periods
  Comment: Store an unlimited number of documents in Hadoop, for example using HBase

21 Key Scenarios: Hadoop as Processing Engine
Use Hadoop as a data processing engine for ETL rationalization to feed SAP HANA.
- MapReduce programs execute process logic
- Pig for data analysis
- Mahout for data mining and machine learning
- Replicate master data to Hadoop for data processing
- Feed results to SAP HANA with Data Services and merge with the conformed model

22 Key Scenarios: Hadoop as Processing Engine
ETL Rationalization
  Description: Low-latency ingestion of data from operational systems
  Comment: Tiered storage: high-value data is loaded and transformed in HANA; in parallel, preprocessing is offloaded to Hadoop
Identify Differences
  Description: Differences in large, but similar sets of data
  Sample use cases: DNA analysis
  Comment: Hadoop using MapReduce
Risk Analysis
  Description: Look for known patterns in data in Hadoop that suggest risky behavior
  Sample use cases: Risk in credit cards; rogue traders
Data Cleansing and Enrichment
  Description: Fix data issues. Enhance with additional information
  Sample use cases: Add demographic or other data to, for example, customer web logs
Data Mining
  Description: Look for patterns, data clusters, and correlations in Hadoop
  Sample use cases: Analyze machine data to predict; correlate customer behaviour
  Comment: Requires Mahout

23 Key Scenarios: Hadoop & HANA for Analytics
- Data volumes in Hadoop are sometimes so large that they can't be replicated into SAP HANA in a cost-effective or timely manner
- Some of the analysis must be done in Hadoop as well as in SAP HANA
- Hadoop queries require longer processing times than SAP HANA queries
- Analysis will likely require combining data from Hadoop, SAP HANA and other sources
Two approaches:
- Two-phase analytics: run analysis continually on Hadoop, then send periodic updates to SAP HANA for fast interactive query response
- Federated queries: split the analysis into parts and run them asynchronously on Hadoop and SAP HANA; federate the results in SAP HANA or BI

24 Key Scenarios: Hadoop & HANA for Analytics. Two-Phase Analytics (diagram)
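A minimal Python sketch of the two-phase idea, under stated assumptions: `batch_aggregate` stands in for the analysis that runs on Hadoop, and `InMemoryStore` stands in for the fast store that serves interactive queries (SAP HANA in the presentation). Both names and the event layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def batch_aggregate(raw_events: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Phase 1: scan the raw (product, quantity) events and total them per product."""
    totals: Counter = Counter()
    for product, quantity in raw_events:
        totals[product] += quantity
    return dict(totals)


class InMemoryStore:
    """Phase 2: holds only the small summary and answers queries without rescanning."""

    def __init__(self) -> None:
        self._summary: Dict[str, int] = {}

    def refresh(self, summary: Dict[str, int]) -> None:
        """Periodic update: replace the summary with the latest batch result."""
        self._summary = dict(summary)

    def total_for(self, product: str) -> int:
        """Interactive query: a single lookup, independent of raw data volume."""
        return self._summary.get(product, 0)
```

The trade-off is freshness: interactive answers are only as current as the last `refresh`, which is why the approach pairs a continuous batch job with periodic updates.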

25 Key Scenarios: Hadoop & HANA for Analytics. Federated Queries (diagram)
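The federated approach, splitting an analysis into parts that run concurrently and merging the results, can be sketched with Python's standard thread pool. The two query functions and their return values are invented placeholders; in a real landscape they would call Hadoop and SAP HANA respectively.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def query_hadoop_part() -> Dict[str, int]:
    """Hypothetical slow part: click counts per customer derived from raw logs."""
    return {"c1": 120, "c2": 45}


def query_hana_part() -> Dict[str, str]:
    """Hypothetical fast part: customer master data from the business system."""
    return {"c1": "Acme", "c2": "Globex", "c3": "Initech"}


def federated_query() -> Dict[str, Dict[str, object]]:
    """Run both parts concurrently, then merge the results on customer id."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        clicks_future = pool.submit(query_hadoop_part)
        names_future = pool.submit(query_hana_part)
        clicks = clicks_future.result()
        names = names_future.result()
    # Customers without log data still appear, with a click count of zero.
    return {
        customer_id: {"name": name, "clicks": clicks.get(customer_id, 0)}
        for customer_id, name in names.items()
    }
```

The total latency is bounded by the slower of the two parts rather than their sum, and only the merged result has to be held in one place.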

26 Use Cases: Healthcare

27 Use Cases: Healthcare, example (diagram)

28 Use Cases: Predictive Maintenance
Business challenges
- A computer server manufacturer wants to implement effective preventative maintenance by identifying problems as they arise, then taking prompt action to prevent the problem from occurring at other customer sites
Technical challenges
- Identifying problems by analyzing text data from call centers and customer questionnaires together with server logs generated by their hardware
- Combining results with CRM, sales and manufacturing data to predict which servers are likely to have problems in the future
Solution
- Use SAP Data Services to analyze call center data and questionnaires stored in Hadoop and identify potential problems
- Use HANA to merge results from Hadoop with server logs to identify indicators in those logs of potential problems
- Combine with CRM, bill of material and production/manufacturing data to identify cases where preventative maintenance would help

29 Pay-per-use Models for HANA & Hadoop

30 Service model defined by 5 parameters
EXAMPLE: SAP ERP 6.0 PRODUCTION system
5 standard parameters define the SAP service
Qualitative
- Availability class: 99.5%
- Disaster-recovery class: DR, local HA
Quantitative
- Managed operations: 24x7
- Managed performance: dialog response time 90% < 1 sec.
- Additional certification(s): ISAE3402 (SOX), SAS70
These parameters reflect the SLAs! These parameters reflect usage!
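The managed-performance parameter (90% of dialog steps answered in under 1 second) is a simple share calculation. The sketch below shows one way to check it against a list of measured response times; the function names and the sample measurements are assumptions for illustration, not part of the service definition.

```python
from typing import Sequence


def share_below(response_times_s: Sequence[float], threshold_s: float = 1.0) -> float:
    """Return the fraction of measurements strictly below the threshold."""
    if not response_times_s:
        raise ValueError("no measurements")
    fast = sum(1 for seconds in response_times_s if seconds < threshold_s)
    return fast / len(response_times_s)


def meets_sla(
    response_times_s: Sequence[float],
    threshold_s: float = 1.0,
    required_share: float = 0.90,
) -> bool:
    """True when at least the required share of dialog steps beat the threshold."""
    return share_below(response_times_s, threshold_s) >= required_share
```

For example, nine measurements of 0.2 s and one of 1.5 s give a share of 0.9 and meet the target, while eight fast and two slow measurements give 0.8 and do not.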

31 SLAs verifiable from within SAP
Transactions represent the actual utilization of the SAP system and are tied to the business.

32 And what about SAP HANA?

33 HANA in the cloud in pay-per-use mode: vHANA
vHANA cloud
- Services included
- Monthly payment based on the memory consumed in HANA

34 Hadoop in Pay Per Use based on OpenStack
Service Governance (Service Desk, Service Management)
Level 5: Hadoop Integration with SAP HANA (Administration, Connectivity)
Level 4: Hadoop Platform Services (Administration/Monitoring, Backup & Recovery, patches, upgrades)
Level 3: OpenStack System Services (Administration/Monitoring, patches, upgrades...)
Level 2: OpenStack Framework (Ceph, Neutron, Nova, Heat)
Level 1: Data Center and Network Services (Administration, Monitoring, Capacity Management)

35 Hadoop in Pay Per Use based on OpenStack
Hadoop cloud
- Services included
- Monthly payment for a managed service, based on the memory/CPU consumed by Hadoop

36 Takeaways

37 Summary: takeaways
- Hadoop excels at very high scale, low cost per TB and data type flexibility
- SAP HANA excels at speed and structure, and is fully integrated with Business Suite enterprise logic
- Leverage the strengths of both platforms in data store, data processing and analytics scenarios
- Carefully evaluate your requirements and use case against these scenarios
- If you are about to start with Hadoop, use Apache Spark & Vora
- Both can be deployed in a simple, pay-per-use model by Fujitsu



More information

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423

More information

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET SOLUTION SHEET Syncsort DMX-h Simplifying Big Data Integration Goals of the Modern Data Architecture Data warehouses and mainframes are mainstays of traditional data architectures and still play a vital

More information

Oracle Big Data Connectors

Oracle Big Data Connectors Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process

More information

Bringing Data to Life

Bringing Data to Life Bringing Data to Life Data management and Visualization Techniques Benika Hall Rob Harrison Corporate Model Risk March 16, 2018 Introduction Benika Hall Analytic Consultant Wells Fargo - Corporate Model

More information

SpagoBI and Talend jointly support Big Data scenarios

SpagoBI and Talend jointly support Big Data scenarios SpagoBI and Talend jointly support Big Data scenarios Monica Franceschini - SpagoBI Architect SpagoBI Competency Center - Engineering Group Big-data Agenda Intro & definitions Layers Talend & SpagoBI SpagoBI

More information

Ian Choy. Technology Solutions Professional

Ian Choy. Technology Solutions Professional Ian Choy Technology Solutions Professional XML KPIs SQL Server 2000 Management Studio Mirroring SQL Server 2005 Compression Policy-Based Mgmt Programmability SQL Server 2008 PowerPivot SharePoint Integration

More information

Customer SAP BW/4HANA. Salvador Gimeno 7 December SAP SE or an SAP affiliate company. All rights reserved. Customer

Customer SAP BW/4HANA. Salvador Gimeno 7 December SAP SE or an SAP affiliate company. All rights reserved. Customer SAP BW/4HANA Customer Salvador Gimeno 7 December 2016 2016 SAP SE or an SAP affiliate company. All rights reserved. Customer 1 DISCLAIMER This presentation is not subject to your license agreement or any

More information

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data Oracle Big Data SQL Release 3.2 The unprecedented explosion in data that can be made useful to enterprises from the Internet of Things, to the social streams of global customer bases has created a tremendous

More information

SQL Server 2017 Power your entire data estate from on-premises to cloud

SQL Server 2017 Power your entire data estate from on-premises to cloud SQL Server 2017 Power your entire data estate from on-premises to cloud PREMIER SPONSOR GOLD SPONSORS SILVER SPONSORS BRONZE SPONSORS SUPPORTERS Vulnerabilities (2010-2016) Power your entire data estate

More information

5 Fundamental Strategies for Building a Data-centered Data Center

5 Fundamental Strategies for Building a Data-centered Data Center 5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse

More information

Building an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle

Building an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Building an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to

More information

The Technology of the Business Data Lake. Appendix

The Technology of the Business Data Lake. Appendix The Technology of the Business Data Lake Appendix Pivotal data products Term Greenplum Database GemFire Pivotal HD Spring XD Pivotal Data Dispatch Pivotal Analytics Description A massively parallel platform

More information

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? Simple to start What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? What is the maximum download speed you get? Simple computation

More information

Streaming Integration and Intelligence For Automating Time Sensitive Events

Streaming Integration and Intelligence For Automating Time Sensitive Events Streaming Integration and Intelligence For Automating Time Sensitive Events Ted Fish Director Sales, Midwest ted@striim.com 312-330-4929 Striim Executive Summary Delivering Data for Time Sensitive Processes

More information

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data on AWS Big Data Agility and Performance Delivered in the Cloud 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data Technologies and techniques for working productively

More information

Gain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved.

Gain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved. Gain Insights From Unstructured Data Using Pivotal HD 1 Traditional Enterprise Analytics Process 2 The Fundamental Paradigm Shift Internet age and exploding data growth Enterprises leverage new data sources

More information

How to Protect SAP HANA Applications with the Data Protection Suite

How to Protect SAP HANA Applications with the Data Protection Suite White Paper Business Continuity How to Protect SAP HANA Applications with the Data Protection Suite As IT managers realize the benefits of in-memory database technology, they are accelerating their plans

More information

DATABASE DESIGN II - 1DL400

DATABASE DESIGN II - 1DL400 DATABASE DESIGN II - 1DL400 Fall 2016 A second course in database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

Chapter 6 VIDEO CASES

Chapter 6 VIDEO CASES Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

Alexander Klein. #SQLSatDenmark. ETL meets Azure

Alexander Klein. #SQLSatDenmark. ETL meets Azure Alexander Klein ETL meets Azure BIG Thanks to SQLSat Denmark sponsors Save the date for exiting upcoming events PASS Camp 2017 Main Camp 05.12. 07.12.2017 (04.12. Kick-Off abends) Lufthansa Training &

More information

MapR Enterprise Hadoop

MapR Enterprise Hadoop 2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS

More information

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development:: Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized

More information

Processing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.

Processing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd. Processing Unstructured Data Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd. http://dinesql.com / Dinesh Priyankara @dinesh_priya Founder/Principal Architect dinesql Pvt Ltd. Microsoft Most

More information

Cloud Computing & Visualization

Cloud Computing & Visualization Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International

More information

Hadoop, Yarn and Beyond

Hadoop, Yarn and Beyond Hadoop, Yarn and Beyond 1 B. R A M A M U R T H Y Overview We learned about Hadoop1.x or the core. Just like Java evolved, Java core, Java 1.X, Java 2.. So on, software and systems evolve, naturally.. Lets

More information

Integrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers

Integrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers Oracle zsig Conference IBM LinuxONE and z System Servers Integrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers Sam Amsavelu Oracle on z Architect IBM Washington

More information

Big Data and Enterprise Data, Bridging Two Worlds with Oracle Data Integration

Big Data and Enterprise Data, Bridging Two Worlds with Oracle Data Integration Big Data and Enterprise Data, Bridging Two Worlds with Oracle Data Integration WHITE PAPER / JANUARY 25, 2019 Table of Contents Introduction... 3 Harnessing the power of big data beyond the SQL world...

More information

Online Bill Processing System for Public Sectors in Big Data

Online Bill Processing System for Public Sectors in Big Data IJIRST International Journal for Innovative Research in Science & Technology Volume 4 Issue 10 March 2018 ISSN (online): 2349-6010 Online Bill Processing System for Public Sectors in Big Data H. Anwer

More information

SQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism

SQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism Big Data and Hadoop with Azure HDInsight Andrew Brust Senior Director, Technical Product Marketing and Evangelism Datameer Level: Intermediate Meet Andrew Senior Director, Technical Product Marketing and

More information

Top 25 Big Data Interview Questions And Answers

Top 25 Big Data Interview Questions And Answers Top 25 Big Data Interview Questions And Answers By: Neeru Jain - Big Data The era of big data has just begun. With more companies inclined towards big data to run their operations, the demand for talent

More information

Big Data The end of Data Warehousing?

Big Data The end of Data Warehousing? Big Data The end of Data Warehousing? Hermann Bär Oracle USA Redwood Shores, CA Schlüsselworte Big data, data warehousing, advanced analytics, Hadoop, unstructured data Introduction If there was an Unwort

More information

Solution Brief. Bridging the Infrastructure Gap for Unstructured Data with Object Storage. 89 Fifth Avenue, 7th Floor. New York, NY 10003

Solution Brief. Bridging the Infrastructure Gap for Unstructured Data with Object Storage. 89 Fifth Avenue, 7th Floor. New York, NY 10003 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 Solution Brief Bridging the Infrastructure Gap for Unstructured Data with Object Storage Printed in the United

More information

Strategic Briefing Paper Big Data

Strategic Briefing Paper Big Data Strategic Briefing Paper Big Data The promise of Big Data is improved competitiveness, reduced cost and minimized risk by taking better decisions. This requires affordable solution architectures which

More information

A Robust, Flexible Platform for Expanding Your Storage without Limits

A Robust, Flexible Platform for Expanding Your Storage without Limits White Paper SUSE Enterprise A Robust, Flexible Platform for Expanding Your without Limits White Paper A Robust, Flexible Platform for Expanding Your without Limits Unlimited Scalability That s Cost-Effective

More information

microsoft

microsoft 70-775.microsoft Number: 70-775 Passing Score: 800 Time Limit: 120 min Exam A QUESTION 1 Note: This question is part of a series of questions that present the same scenario. Each question in the series

More information

Top Five Reasons for Data Warehouse Modernization Philip Russom

Top Five Reasons for Data Warehouse Modernization Philip Russom Top Five Reasons for Data Warehouse Modernization Philip Russom TDWI Research Director for Data Management May 28, 2014 Sponsor Speakers Philip Russom TDWI Research Director, Data Management Steve Sarsfield

More information

Cloud Analytics and Business Intelligence on AWS

Cloud Analytics and Business Intelligence on AWS Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse

More information

CISC 7610 Lecture 2b The beginnings of NoSQL

CISC 7610 Lecture 2b The beginnings of NoSQL CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone

More information

Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor

Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,

More information

Based on Big Data: Hype or Hallelujah? by Elena Baralis

Based on Big Data: Hype or Hallelujah? by Elena Baralis Based on Big Data: Hype or Hallelujah? by Elena Baralis http://dbdmg.polito.it/wordpress/wp-content/uploads/2010/12/bigdata_2015_2x.pdf 1 3 February 2010 Google detected flu outbreak two weeks ahead of

More information

EMEA USERS CONFERENCE BERLIN, GERMANY. Copyright 2016 OSIsoft, LLC

EMEA USERS CONFERENCE BERLIN, GERMANY. Copyright 2016 OSIsoft, LLC Bridge IT and OT with a process data warehouse Presented by Franco Camba, OSIsoft Matt Ziegler, OSIsoft Frank Ruland, SAP Audience Poll Have you invested or are you looking into Business Intelligence tools?

More information