An InterSystems Guide to the Data Galaxy. Benjamin De Boe Product Manager
2 Analytics
3 3 InterSystems Corporation. All rights reserved.
8 Analytical Workflows & Tools An InterSystems Guide to the Data Galaxy
9 Capturing & Storing Data
[Diagram labels: raw data, Capturing & Storing, curated data]

10 Capturing & Storing Data
Goals:
- Persist data in a form that maximizes options for analytics
Tools:
- Storage: Hadoop, HDFS, NoSQL DBMS
- Database: RDBMS, Multi-model DBMS
- Solutions: Splunk, Flume
Challenges:
- Volume, Velocity and Variety of data
- Provide flexible & efficient access to data
11 Data Processing
[Diagram labels: Processing, raw data, Capturing & Storing, curated data]

12 Data Processing
Goals:
- Transform raw data into analytics-ready formats
Tools:
- Frameworks: Hadoop MapReduce, Spark, Flink, ...
- ETL tools: ODI, Attunity, Talend, Informatica, SnapLogic, ...
- Databases: SQL, Stored procedures, ...
Challenges:
- Efficiency & scalability as volumes grow
- Trade-off between flexibility & usability
13 Exploration & Analytics
[Diagram labels: Business Analytics, Exploring, Processing, raw data, Capturing & Storing, curated data]

14 Exploration & Analytics
Goals:
- Exploratory analysis of raw data's general characteristics & distribution
- Deeper analysis through predefined dimensions, measures & KPIs
- Identify opportunities for data refinement & further analysis
Tools:
- Exploration: DBMS, Hive, SparkSQL, Periscope, ...
- Analytics: Tableau, QlikView, Birst, Datameer, Oracle BI, Cognos, BusinessObjects, ...
Challenges:
- Trade-off between flexibility & ease of use through predefined data models
- Support for semi-structured and structured data
- Efficiency & scalability as volumes grow
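As an illustration of the first goal above (looking at a raw column's general characteristics and distribution), here is a minimal Python sketch. The function name profile, the bucket_width parameter and the sample ages column are all hypothetical and not taken from the slides; a real exploration tool would do this inside the DBMS or query engine rather than in application code.

```python
from collections import Counter
from statistics import mean, median


def profile(values, bucket_width):
    """Summarise a numeric column: basic statistics plus a bucketed distribution."""
    # Assign each value to the lower bound of its bucket, e.g. 23 -> 20 for width 10.
    buckets = Counter((v // bucket_width) * bucket_width for v in values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "histogram": dict(sorted(buckets.items())),
    }


# Hypothetical sample column, for illustration only.
ages = [23, 35, 31, 47, 52, 38, 29, 41]
summary = profile(ages, bucket_width=10)
print(summary)
```

A summary like this is typically the starting point for deciding which dimensions and measures are worth predefining for deeper analysis.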
15 Machine Learning
[Diagram labels: Business Analytics, Machine Learning, Exploring, Processing, raw data, Capturing & Storing, curated data]

16 Machine Learning
Goals:
- Identify correlations between input features and outcomes
- Define derived features and models that maximize predictive value
Tools:
- Frameworks: R, Spark ML, System ML, SciKit, ...
- Products: SAS, SPSS, BigML, KNIME, ...
Challenges:
- Abstract complexity of underlying algorithms
- Optimize complex computations on large datasets
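The first goal above, identifying correlations between input features and outcomes, can be sketched with a plain Pearson correlation coefficient. This is only one simple measure among many, and the feature names and values below are made up for illustration; the frameworks listed on the slide provide far more complete tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric sequences.

    Raises ZeroDivisionError if either sequence is constant (zero variance).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical input features and outcome, for illustration only.
features = {"visits": [1, 2, 3, 4], "age": [30, 25, 40, 35]}
outcome = [2, 4, 6, 8]

# Rank features by the strength of their linear relationship with the outcome.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], outcome)),
    reverse=True,
)
print(ranking)
```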
17 Closing the Loop
[Diagram labels: Predicting, Machine Learning, Business Analytics, Exploring, Processing, raw data, Capturing & Storing, curated data]

18 Closing the Loop
Goals:
- Apply predictive models to new data
- Leverage analytics output in transactional systems
Tools:
- Standard: PMML
- Tools: Zementis, SAS
Challenges:
- Efficient model scoring in transactional systems
- Avoid complexity and heterogeneity in production environment
19 An Open Analytics Platform An InterSystems Guide to the Data Galaxy
20 Recurring Themes
Diverse tasks require diverse toolsets: Open Platform
- There is no single tool that covers the entire spectrum
- A proper combination of tools can help address the usability vs flexibility trade-off
Importance of a flexible storage layer: Flexible Platform
- Tool diversity should not be an excuse for siloing
- Move computations closer to the data to increase efficiency
Horizontal scalability is a must: Scalable Platform
- Almost every task lists volume-related challenges
22 InterSystems IRIS Analytics Strategy
An Open Analytics Platform allows partners and end-users to build solutions and services that leverage best-of-breed analytics technologies, both embedded in the platform and complementary to it.
[Diagram labels: third-party tools, embedded technologies, industry standards, MDX, UIMA, Apache Spark, BI, Text Analytics, PMML, Connector, JDBC, Integration, InterSystems IRIS: Database]
23 InterSystems IRIS: Database
The InterSystems IRIS Database sits at the core of our Open Analytics Platform.
It scales both up and out to meet the most demanding workloads:
- Parallel processing for vertical scalability
- ECP application servers and sharding for horizontal scalability
For SQL access, the InterSystems JDBC driver offers many optimizations to maximize throughput, all fully transparent to analytical tools connecting to the database:
- Leverage distributed data layout for ingestion
- Exploit low-level storage format for select queries
- Shared memory access
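To make the idea of sharding for horizontal scalability concrete, the following Python sketch spreads rows over a fixed number of shards by hashing a shard key. It is a generic illustration only: the slides do not describe how InterSystems IRIS assigns rows to shards, and the names shard_for, partition and customer_id are hypothetical.

```python
import zlib


def shard_for(key, shard_count):
    """Map a shard key to a shard index using a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def partition(rows, key_field, shard_count):
    """Distribute rows over shard_count buckets based on each row's shard key."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row[key_field], shard_count)].append(row)
    return shards


# Hypothetical rows, for illustration only.
rows = [{"customer_id": i, "amount": i * 10} for i in range(100)]
shards = partition(rows, "customer_id", shard_count=4)
print([len(shard) for shard in shards])
```

Because the mapping is deterministic, a loader that knows the layout can send each row straight to the shard that owns it, which is the general idea behind leveraging a distributed data layout for ingestion.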
24 InterSystems IRIS: Analytics
InterSystems IRIS: Analytics comes with embedded technologies for exploring and analysing both structured and unstructured data. Being embedded in the platform, they can be easily leveraged in applications and solutions built on InterSystems IRIS.
InterSystems IRIS: Business Intelligence (fka DeepSee)
- Define and query multidimensional OLAP cubes
- Supports both ad-hoc analysis and embeddable dashboards
InterSystems IRIS: Text Analytics (fka iKnow)
- Identify concepts and their context from text through a unique bottom-up approach
- Enable knowledge professionals to explore and consume large volumes of texts
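For readers unfamiliar with OLAP cubes, the sketch below shows the underlying idea in plain Python: facts are grouped by one or more dimensions and a measure is aggregated per cell. It is not MDX and not the InterSystems IRIS Business Intelligence API; the aggregate function and the sample facts are hypothetical.

```python
from collections import defaultdict


def aggregate(facts, dimensions, measure):
    """Sum a measure for every combination of the requested dimension values."""
    totals = defaultdict(float)
    for fact in facts:
        cell = tuple(fact[dimension] for dimension in dimensions)
        totals[cell] += fact[measure]
    return dict(totals)


# Hypothetical fact rows, for illustration only.
facts = [
    {"region": "EU", "year": 2016, "revenue": 100.0},
    {"region": "EU", "year": 2017, "revenue": 150.0},
    {"region": "US", "year": 2017, "revenue": 200.0},
    {"region": "EU", "year": 2017, "revenue": 50.0},
]

by_region = aggregate(facts, ["region"], "revenue")
by_region_year = aggregate(facts, ["region", "year"], "revenue")
print(by_region)
print(by_region_year)
```

Moving from by_region to by_region_year corresponds to drilling down along an extra dimension in an ad-hoc analysis.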
25 InterSystems IRIS: Apache Spark Connector
Apache Spark addresses the need for efficient, horizontally scalable data processing through its Resilient Distributed Dataset abstraction:
- Much higher developer productivity than Hadoop's MapReduce framework
- Lazy execution & optimized code generation enable very high throughput
- Comes with a broad set of libraries, including Machine Learning, Graph, etc.
- Active and broad user community
Apache Spark's Data Source API allows database vendors to implement optimizations that leverage the underlying data store:
- Spark's Catalyst compiler will use these to push computations to the data
- Further convenience extensions are also available
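Lazy execution, mentioned above, means that transformations are only recorded when they are declared and run when a result is actually requested. The following Python sketch mimics that behaviour on a single machine. It is not Spark code and the LazyDataset class is hypothetical; it only illustrates why deferring work gives an engine the chance to plan the whole pipeline first.

```python
class LazyDataset:
    """Records transformations and applies them only when collect() is called."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Returns a new dataset description; no data is touched yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        # Only now is the recorded pipeline executed, streaming item by item.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []


def double(x):
    calls.append(x)  # Track when the function is really invoked.
    return x * 2


pipeline = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_collect = len(calls)  # Still 0: nothing has run so far.
result = pipeline.collect()
print(calls_before_collect, result)
```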
26 InterSystems IRIS: Apache Spark Connector
InterSystems IRIS Apache Spark Connector capabilities:
- Convenient Read/Write API
  - Expose DB table as Dataset
  - Expose arbitrary SQL query as Dataset
- Pushdown strategies
  - filter(), select(), and more advanced where possible
- Leverage distributed DB data layout & parallelism
  - Biggest gains vs vanilla JDBC connector
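The pushdown idea listed above can be sketched as follows: instead of fetching a whole table and filtering it in the processing engine, select() and filter() calls are folded into the SQL statement sent to the database. This is a simplified Python illustration, not the connector's actual API. The TableDataset class and the Sales.Orders table are hypothetical, and predicates are passed as raw SQL strings without any escaping, which real code should not do.

```python
class TableDataset:
    """Collects select/filter calls and turns them into one SQL statement."""

    def __init__(self, table, columns=("*",), predicates=()):
        self.table = table
        self.columns = tuple(columns)
        self.predicates = tuple(predicates)

    def select(self, *columns):
        return TableDataset(self.table, columns, self.predicates)

    def filter(self, predicate):
        return TableDataset(self.table, self.columns, self.predicates + (predicate,))

    def to_sql(self):
        # Both the projection and the predicates end up in the query itself,
        # so the database only returns the rows and columns that are needed.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql


orders = TableDataset("Sales.Orders")
query = orders.filter("amount > 100").select("id", "amount").to_sql()
print(query)
```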
27 InterSystems IRIS: Apache Spark Connector
[Diagram labels: JDBC, InterSystems IRIS Spark Connector, Spark, Master, Shard]
28 InterSystems IRIS: PMML Support
[Diagram labels: Transactional System, Advanced Analytics, Model Consumer, Model Producer]

29 InterSystems IRIS: PMML Support
[Diagram labels: Transactional System, PMML, Advanced Analytics, Model Consumer, Model Producer]
30 InterSystems IRIS: PMML Support
PMML standard enables best-of-breed approach for Predictive Analytics
- Leverage existing skillset in 3rd party tools for model building in lab environment
- Leverage InterSystems IRIS Data Platform for model execution in production environment
- Fits our Open Analytics Platform roadmap
Standard support in InterSystems IRIS: PMML Consumer
- Automatically generate optimized COS code based on PMML
- Leverage from other InterSystems IRIS embedded technologies, ...
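To show what a PMML consumer does in principle, the Python sketch below reads a linear regression model from a PMML fragment and scores a new record with it. It is only an illustration of the producer/consumer split and not the InterSystems IRIS implementation, which according to the slide generates COS code instead. The PMML fragment is deliberately simplified (a complete document also carries elements such as a header and a data dictionary), and the field names age and visits are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical PMML fragment; a real model file contains more elements.
PMML_DOC = """
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.25"/>
      <NumericPredictor name="visits" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_3"}


def load_linear_model(pmml_text):
    """Extract the intercept and per-field coefficients from a regression table."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one input record."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
prediction = score(model, {"age": 40, "visits": 3})
print(prediction)
```

The model is built elsewhere and exchanged as a document, so the scoring side needs no knowledge of the tool that produced it.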
31 InterSystems IRIS: UIMA Integration
Combining different NLP tools is challenging at best
[Diagram labels: NLP tech 1, NLP tech 2, custom]

32 InterSystems IRIS: UIMA Integration
UIMA standard offers interoperability & scalability for NLP application developers
[Diagram labels: NLP tech 1, NLP tech 2, UIMA]
33 InterSystems IRIS: UIMA Integration
InterSystems IRIS UIMA Integration marries data platform and UIMA data processing infrastructure for increased developer productivity and a quicker path to analytics
[Diagram labels: NLP tech 1, NLP tech 2, UIMA, ISC NLP, Source Text, Annotations, InterSystems IRIS Data Platform]
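The relationship between source text and annotations shown on the slide can be illustrated with stand-off annotations: each annotator leaves the text untouched and only records begin and end offsets plus a type, so results from different NLP tools can be combined over the same text. The Python sketch below follows that general idea only. It does not use the UIMA API, and the Annotation class and both toy annotators are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    begin: int      # Offset of the first covered character.
    end: int        # Offset just past the last covered character.
    kind: str       # Annotation type, e.g. "Number".
    annotator: str  # Which tool produced the annotation.


def covered_text(text, annotation):
    """Return the span of the source text an annotation refers to."""
    return text[annotation.begin:annotation.end]


def annotate_numbers(text):
    """Toy annotator 1: marks runs of digits."""
    return [Annotation(m.start(), m.end(), "Number", "tech1") for m in re.finditer(r"\d+", text)]


def annotate_capitalised(text):
    """Toy annotator 2: marks capitalised words."""
    return [
        Annotation(m.start(), m.end(), "Capitalised", "tech2")
        for m in re.finditer(r"\b[A-Z][a-z]+\b", text)
    ]


text = "Patient John received 20 mg of aspirin."
# Both annotators work on the same unmodified text, so merging is a simple sort.
merged = sorted(annotate_numbers(text) + annotate_capitalised(text), key=lambda a: a.begin)
for annotation in merged:
    print(annotation.kind, covered_text(text, annotation))
```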
34 Conclusion An InterSystems Guide to the Data Galaxy
35 Conclusions
The InterSystems IRIS Data Platform is an Open Analytics Platform, helping customers leverage the tools they know on a platform that scales:
- Through embedded technologies for exploration & analytics
- Through connectors for best-of-breed third-party technologies, allowing those tools to leverage our platform in a way that optimizes overall system performance
- Through integration with relevant industry standards, enabling customers to leverage models and components created in dedicated tools as part of new solutions
36 Related Sessions
InterSystems IRIS: Database
- What's Lurking in your Data Lake? (Tue, 3PM)
- We Want More, Solving Scalability (Tue, 4PM; Wed, 11AM)
- Speeding up JDBC (Wed, 10:30AM, flash talk)
InterSystems IRIS: Spark Connector
- Apache Spark Experience lab (runs daily)
InterSystems IRIS: Text Analytics
- iKnow: Treating Patients with iKnow and REST (Mon, 3:30PM)
- iFind: Catching Bad Guys with iFind and REST (Tue, 4PM)
- UIMA: Annotations on the Sofa (Wed, 9:45AM, flash talk)
37 Please complete the survey in the event app.
Session recording and slides will be available at: Search for Global Summit 2017
Visit the Tech Exchange to learn more!
38 Thank you.
course content COURSE DETAILS 1. In-detail explanation on the concepts of HDFS & MapReduce frameworks 2. What is 2.X Architecture & How to set up Cluster 3. How to write complex MapReduce Programs 4. In-detail
More informationUnderstanding the latent value in all content
Understanding the latent value in all content John F. Kennedy (JFK) November 22, 1963 INGEST ENRICH EXPLORE Cognitive skills Data in any format, any Azure store Search Annotations Data Cloud Intelligence
More information1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda
Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 5: Analyzing Relational Data (1/3) February 8, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo
More informationTeradata Aggregate Designer
Data Warehousing Teradata Aggregate Designer By: Sam Tawfik Product Marketing Manager Teradata Corporation Table of Contents Executive Summary 2 Introduction 3 Problem Statement 3 Implications of MOLAP
More informationETL is No Longer King, Long Live SDD
ETL is No Longer King, Long Live SDD How to Close the Loop from Discovery to Information () to Insights (Analytics) to Outcomes (Business Processes) A presentation by Brian McCalley of DXC Technology,
More informationSecurity and Performance advances with Oracle Big Data SQL
Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,
More informationBig Data The end of Data Warehousing?
Big Data The end of Data Warehousing? Hermann Bär Oracle USA Redwood Shores, CA Schlüsselworte Big data, data warehousing, advanced analytics, Hadoop, unstructured data Introduction If there was an Unwort
More informationEmbedded Technosolutions
Hadoop Big Data An Important technology in IT Sector Hadoop - Big Data Oerie 90% of the worlds data was generated in the last few years. Due to the advent of new technologies, devices, and communication
More informationShine a Light on Dark Data with Vertica Flex Tables
White Paper Analytics and Big Data Shine a Light on Dark Data with Vertica Flex Tables Hidden within the dark recesses of your enterprise lurks dark data, information that exists but is forgotten, unused,
More informationOracle Big Data Discovery
Oracle Big Data Discovery Turning Data into Business Value Harald Erb Oracle Business Analytics & Big Data 1 Safe Harbor Statement The following is intended to outline our general product direction. It
More informationImplementing and Maintaining Microsoft SQL Server 2005 Analysis Services
Implementing and Maintaining Microsoft SQL Server 2005 Analysis Services Introduction Elements of this syllabus are subject to change. This three-day instructor-led course teaches students how to implement
More informationWhat is Gluent? The Gluent Data Platform
What is Gluent? The Gluent Data Platform The Gluent Data Platform provides a transparent data virtualization layer between traditional databases and modern data storage platforms, such as Hadoop, in the
More informationKNIME for the life sciences Cambridge Meetup
KNIME for the life sciences Cambridge Meetup Greg Landrum, Ph.D. KNIME.com AG 12 July 2016 What is KNIME? A bit of motivation: tool blending, data blending, documentation, automation, reproducibility More
More informationStream Processing Platforms Storm, Spark,.. Batch Processing Platforms MapReduce, SparkSQL, BigQuery, Hive, Cypher,...
Data Ingestion ETL, Distcp, Kafka, OpenRefine, Query & Exploration SQL, Search, Cypher, Stream Processing Platforms Storm, Spark,.. Batch Processing Platforms MapReduce, SparkSQL, BigQuery, Hive, Cypher,...
More information