SpagoBI and Talend jointly support Big Data scenarios


Monica Franceschini, SpagoBI Architect
SpagoBI Competency Center, Engineering Group

Big data - Agenda
Intro & definitions
Layers
Talend & SpagoBI
SpagoBI big-data roadmap

Big Data - 3Vs
"Big data" is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Source: The Importance of 'Big Data': A Definition, Mark Beyer, Douglas. Gartner, 21 June 2012.
VOLUME: The increase in data volumes within enterprise systems is caused by transaction volumes and other traditional data types, as well as by new types of data. Too much volume is a storage issue, but too much data is also a massive analysis issue.
VARIETY: IT leaders have always had an issue translating large volumes of transactional information into decisions; now there are more types of information to analyze, mainly coming from social media and mobile (context-aware). Variety includes tabular data (databases), hierarchical data, documents, e-mail, metering data, video, still images, audio, stock ticker data, financial transactions and more.
VELOCITY: This involves streams of data, structured record creation, and availability for access and delivery. Velocity means both how fast data is being produced and how fast the data must be processed to meet demand.
Source: Gartner Press Release, Gartner Says Solving Big Data Challenge Involves More Than Just Managing Volumes of Data, June 27, 2011
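The velocity definition above compares two rates: how fast data is produced and how fast it is processed. A minimal sketch of that arithmetic in Python follows. The function name `backlog_after` and all numbers are hypothetical and chosen only for illustration, and constant rates are assumed.

```python
def backlog_after(produced_per_s, processed_per_s, seconds, start=0):
    """Records still waiting after `seconds`, assuming constant rates.

    The backlog grows when production outpaces processing and can never
    drop below zero. All names and figures here are illustrative only.
    """
    return max(0, start + (produced_per_s - processed_per_s) * seconds)


if __name__ == "__main__":
    # Production faster than processing: the backlog grows every second.
    print(backlog_after(produced_per_s=1000, processed_per_s=800, seconds=60))
    # Processing faster than production: an existing backlog drains.
    print(backlog_after(produced_per_s=500, processed_per_s=800, seconds=10, start=5000))
```

When the first rate stays above the second, the backlog grows without bound, which is the situation the velocity dimension warns about.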

Big Data - 3Vs & more
VARIABILITY: Variance in meaning, in lexicon.
VERACITY: 1 in 3 business leaders don't trust the information they use to make decisions. How can you act upon information if you don't trust it? Establishing trust in big data presents a huge challenge as the variety and number of sources grows.
VALUE: The economic value of different data varies significantly. Typically there is good information hidden amongst a larger body of nontraditional data; the challenge is identifying what is valuable and then transforming and extracting that data for analysis.

Big data - Layers
Infrastructure: On-site, IaaS
Data management (ETL): capture, cleaning, loading, store
View and Analyse (Business Intelligence): text analysis, text mining, exploration, navigation, presentation
Application: Cloud, SaaS
Services

Big data & Business Intelligence
Tasks:
Manage big data (ETL): Talend
Read, interpret and show big data (BI): SpagoBI
Big data and real time (BI): SpagoBI

Talend - Big Data Management
Big Data Production: RDBMS, Analytical DB, NoSQL DB, ERP/CRM, SaaS, Social Media, Web Analytics, Log Files, RFID, Call Data Records, Sensors, Machine-Generated
Big Data Management: Big Data Integration, Big Data Quality
Big Data Consumption
Activities: Mining, Analytics, Storage, Processing, Filtering, Parsing, Checking, Search, Enrichment
Turn Big Data into actionable information

Talend Goal: democratize Big Data
Talend Open Studio for Big Data
Big Data for the Masses
Improves efficiency of big data job design with a graphical interface
Abstracts and generates code
HCatalog
Run transforms inside Hadoop
Native support for HDFS, Sqoop, HBase, Mahout, Pig, Hive & MapReduce code generation
An open source ecosystem:
Apache License 2.0
Embedded in Hortonworks Data Platform
Certified with Cloudera, MapR and Greenplum

ETL: Analytical databases & appliances
Connectors from/to: Greenplum, Netezza, Sybase, Teradata, VectorWise, Vertica, HDFS, HBase, Hive, Cassandra, MongoDB

SpagoBI - load
Certified appliances: Teradata, VectorWise
Connectors from: Cassandra, HBase, Hive, Impala, Hadoop
RT with: Storm, WSO2
More: Scheduled data set, In-memory data set

SpagoBI - meaning
Connectors from: Neo4J, Freebase, OrientDB
Support for open standards:
RDF (Resource Description Framework) http://www.w3.org/rdf/
OWL (Web Ontology Language) http://www.w3.org/owl/
R
Mahout
Text mining
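RDF, listed among the open standards above, describes data as subject-predicate-object triples. The following is a minimal sketch of that data model in plain Python. It is not SpagoBI code and uses no RDF library. The example.org URIs, the `TRIPLES` list and the `match` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical triples; the example.org URIs are placeholders, not real identifiers.
TRIPLES = [
    ("http://example.org/SpagoBI", "http://example.org/connectsTo", "http://example.org/Neo4J"),
    ("http://example.org/SpagoBI", "http://example.org/connectsTo", "http://example.org/OrientDB"),
    ("http://example.org/Talend", "http://example.org/connectsTo", "http://example.org/Hive"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every pattern field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


if __name__ == "__main__":
    # List everything stated about one subject.
    for s, p, o in match(TRIPLES, subject="http://example.org/SpagoBI"):
        print(s, p, o)
```

Leaving a field as `None` acts as a wildcard, which is roughly how a variable behaves in a triple pattern query.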

SpagoBI - show
Explorative front-end
Network analysis
Exploration
In-memory
Data visualization

SpagoBI - roadmap
Capture / Store
Talend, connector to/from: Greenplum, Netezza, Sybase, Teradata, VectorWise, Vertica, HDFS, HBase, Hive, Cassandra, MongoDB
Load
Certified appliances: Teradata, VectorWise
Connectors from: Cassandra, HBase, Hive, Impala, Hadoop, MongoDB
RT with: Storm, WSO2
More: Scheduled data set, In-memory data set

SpagoBI - roadmap
Meaning
Connectors from: Neo4J, Freebase, OrientDB
Support for open standards: RDF, OWL
Mining: R, MashR, Text mining
Show
Explorative front-end, Network analysis, Data visualization
Services
Big data as a service
Multitenant
Cloud
BI as a service (ad-hoc + self-service)
Data scientist

Bundle Talend - SpagoBI
SpagoBI and Talend announce their bundle! The bundle will provide:
a distribution of both tools, interacting with each other
a use case that can be run to explore their functionalities

Monica.franceschini@eng.it @twittmonique