Lily 2.4 What s New Product Release Notes
|
|
- William Bruce
- 5 years ago
- Views:
Transcription
1 Lily 2.4 What s New Product Release Notes
2 WHAT S NEW IN LILY Table of Contents Table of Contents... 2 Purpose and Overview of this Document... 3 Product Overview... 4 General... 5 Prerequisites... 5 The Lily Data Repository... 6 Side Effect Processor... 6 Multi-Repository Support... 6 Metadata... 6 Record Type Inheritance... 5 HDFS Name Node Support... 8 The Lily Customer Database... 9 Foundations... 9 Master Records... 9 Hive Integration... 9 Interaction Summary Attribute Calculation Engine Lily Customer Database Explorer User Interface Item Explorer Lily Customer Intelligence Applications Rules - Based Recommendation Strategies Knowledge Based Recommendations Provisioning... 15
3 WHAT S NEW IN LILY Purpose and Overview of this Document This document serves as a comprehensive, Customer-oriented overview of all feature additions, changes and enhancements available in Lily Release 2.4. Lily 2.4 has been released the 22nd of July 2013 with the main release focus being updates to the Lily Customer Database and the Lily Data Repository.
4 WHAT S NEW IN LILY Product Overview Lily, the Customer Intelligence Platform, Release 2.4 is composed of the following products: Lily Data Repository (DR) Lily Customer Database (CDB) Lily Customer Intelligence Applications (CA) Lily Enterprise tools: o Cluster installation o ETL tool connectivity o Hive connectivity The Lily Customer Database relies upon the Data Repository, and the Customer Applications on the Customer Database.
5 WHAT S NEW IN LILY General Prerequisites Lily 2.4 has been certified against Cloudera s Distribution Including Apache Hadoop (CDH4) and therefore requires the following: CDH4.2 o Supported Operating Systems: docs/cdh4/4.2.0/cdh4-requirements-and-supported- Versions/cdhrsv_topic_1.html o Supported Databases: content/cloudera-docs/cdh4/4.2.0/cdh4-requirements-and- Supported-Versions/cdhrsv_topic_2.html o Support JDK: content/cloudera-docs/cdh4/4.2.0/cdh4-requirements-and- Supported-Versions/cdhrsv_topic_3.html
6 WHAT S NEW IN LILY The Lily Data Repository With Release 2.4 of the Lily Data Repository a number of key enhancements and expansion of functionality has been included. By delivering these enhancements NGDATA has strengthened and expanded the foundation of the Lily Data Repository, for all users and continuing to provide the best-in-breed solution for all clients. Side Effect Processor A new component, the Side Effect Processor (SEP) has been developed for the Data Repository allowing for greater processing efficiency of asynchronous update operations The SEP sits at the core of Lily s triggering and indexing mechanism and replaces parts of the existing RowLog engine. With this enhancement, Lily is three times faster with indexing switched off, and five times faster when using indexing. Also, maintenance of multiple indexes at the same time bears no additional performance impact. The SEP is also being used for computing aggregates and maintaining additional data indexes in the Customer Database. Multi-Repository Support With this release, records retained within the Lily Data Repository can be stored across multiple Lily Repositories within HBase, thereby supporting the logical segregation of data providing additional performance gains and data security. It is possible to host multiple data repositories in one shared installation, e.g. to operate a shared Lily for multiple departments. Metadata Lily has been enhanced with this release to support field level metadata. This metadata may be comprised of key / value pairs where key is a string and a value is a simple type, such as String, Integer, Long, Float, Double, Boolean, or byte. Providing this functionality to the Lily Data Repository allows Users to identify information such as Source, Creator, Timestamp, Quality, etc. without having to predefine this field in the schema and therefore making the addition of this data by an application simpler. Amongst others, field metadata may be utilized to store access control information.
7 WHAT S NEW IN LILY Record Type Inheritance When creating a schema definition the Lily Data Repository functionality now supports the concept of Supertypes. This allows a User to define a base record type and then Inherit the structure of that base type into another record type that then extends it. As an example: recordtypes: [ { name: person, fields : [ { name: name, mandatory: true }, { name: address, mandatory: true} ] }, { name: Customer, fields: [ { name: Customer Number, mandatory: true } ], supertypes: [ { name: person } ] }, { name: employee, fields: [ { name: employeenumber, mandatory: true } ], supertypes: [ { name: person } ] } ]
8 WHAT S NEW IN LILY In this example, the Customer record will have fields Name, Address, and Customer Number while the Employee record will have Name, Address, and Employee Number. By utilizing the recordtype inheritance, schema design has become much more elegant. HDFS Name Node Support Lily 2.4 Enterprise ships with deployment configuration and instructions to support high-availability of the Hadoop HDFS Name Node service. This allows enterprise clients to run Hadoop in a fault-tolerant configuration and be able to maintain system uptime in the event of server failure.
9 WHAT S NEW IN LILY The Lily Customer Database Foundations At the foundation of the Lily Customer Database, with this release the two enhancements are being delivered: Interaction Timestamps o Lily will add a Timestamp to all Interactions, when the Interaction Timestamp has not been specified on the input data, thereby ensuring the Timestamp field will always contain a value Automatic Creation of Item or Customer o Lily will add a new Item and/or Customer record when an Interaction is logged for a non-existent Item and/or Customer ensuring 100% of the Interactions are prepared for further analysis Master Records An enhancement to the Lily Customer Database allows Users to specify that Customer records from multiple sources are actually for the same Customer. When this occurs, a Master Customer record is created to combine all of the data from the different sources. Each Source Record is kept intact for auditing purposes, but a Master Record is updated to reflect a combination of the most recent data. This is beneficial because it allows Lily Users to get a single view of their Customers in a central place based on data from different systems. Hive Integration Expanding upon the existing Lily Hive integration, a number of enhancements and added features have been developed and included in Release 2.4, which are highlighted below: Generate SQL like Queries called HiveQL for searching data in the Customer Database through a wizard on the Lily User Interface o This User Interface allows a User to generate a Hive query that which can be run offline o This allows the query to be copied into and utilized inside popular BI or data exploration tools supporting Hive such as Tableau, Toad4Cloud,
10 WHAT S NEW IN LILY Search data in Lily via Hive using field level metadata Store data set statistics in the HDFS layer of Lily By expanding the Hive integration NGDATA continues to improve the ability of Users to do analysis on the data that is stored in Lily. Interaction Summary Attribute Calculation Engine The Lily Customer Database has been updated with an engine for both real-time & batch calculation of Summary Attributes that can be defined by the Client, such as: amount of visits, average amount spent, time on website, etc. Lily supports the following basic aggregators: Max, Min, Average, Count, Distinct Count, Last, Sum. Summary Attributes are calculated values, which are based on the analysis of incoming/processed interactions. Summaries Attributes are always stored on a single Customer or Item record, which is comparable to an SQL query with an Aggregation (e.g. SUM) and a GROUP BY Customer. The real-time functionality of this Variable Calculation Engine has been developed as a SEP component to compute the Summary Attributes in real-time during ingestion of the data. The calculated values can also be used in the Lily Customer Database Explorer for Facet Search, allowing for a richer User experience in the User Interface and a continued expansion of the functionality of Lily as a data exploration tool. The usage of the Summary Attributes within the Faceted Search functionality can be enabled/disabled on a Summary Attribute level. In addition to the real-time functionality, Lily also supports a batch-based rebuild of all variables using the parallel processing power of MapReduce. Lily Customer Database Explorer The Lily Customer Database Explorer has been enhanced with the functionality to support facet-based searches using graph widgets (bar chart, pie chart ). This type of User Interface navigation provides the User visual insights into the Lily Customer Database, allowing a step-by-step creation of Customer or Item target views. The Lily Customer Database Explorer also makes it possible to select one or more Segments to aid in segment analysis by utilizing interactive usage & selection within the of the graph-based User Interface widgets.
11 WHAT S NEW IN LILY The graph-based widgets utilize faceted data as input, provided by the Lily Customer Database and support the following types of facets: Aggregated Facets: These type of facets make use of interaction summary attributes, like shopping frequency, click count, Field Facets: These type of facets are not using summarized data but basic field values like gender, region, User Interface With the delivery of Release 2.4, the Customer Database Explorer includes the functionality to drill down to the individual Customer records based on a Faceted Search by the following areas of Customer classification: Identity data Profile and segmentation data Behavioral data aggregated from customer interactions Preferences calculated by scoring and learning engines Source selection A User is able to filter on selected facet values through an Include/Exclude function, allowing for easy audience or item selection through a query-by-example interface.
12 WHAT S NEW IN LILY Additionally, a User is able to get down to view the results of the filter in the screen below. The resulting results will include the following information: ID, Identity, Behavior and Preference for the individual record result. The User is able to see ID, Identity, Behavior and Preference results on a individual record result.
13 WHAT S NEW IN LILY Item Explorer With the release of Lily 2.4, a User will be able to explore Item data through a Faceted Search as well. The User will begin the search through product categories, filter facets, and delving in the data of individual Items.
14 Lily Customer Intelligence Applications WHAT S NEW IN LILY Rules-Based Recommendation Strategies The Rules-Based Recommendation Strategies functionality has been enhanced to support Strategies through dynamic business rules for Pre / Post processing of Recommendations. This includes selection of recommendation engines, different models within the Recommendation Engines, bypassing recommendation engine consultation, as well as influencing scores based on decision tables. A variety of Business Rules may be defined specific to a Clients needs and the calculations or the activation / deactivation of individual rules may be made by a User without having to redeploy a new version of the application. Knowledge-Based Recommendations A new Knowledge-Based Recommendation Engine can utilize the knowledge about Users and Items and reasons out what products meet the Users requirements. This has been delivered with this release of Lily. It supports the configuration and application of custom-made, domain-specific recommendation scores based on aggregates and overall Customer interaction data. This functionality allows Users to determine recommendations for a Customer based solely on their behavior and not taking into account the interactions of other Customers, such as with the Lily Collaborative Filtering Machine Learning Recommendation Engine.
15 WHAT S NEW IN LILY Provisioning With the 2.4 release of Lily, a new provisioning framework has been created to assist in the installation and upgrade of Lily clusters. This framework is a Python solution based upon Fabric and is capable of installing an entire Lily cluster including Cloudera Manager and CDH4. By utilizing this tool, it is possible to create a Lily cluster on EC2 and have it running within 15 minutes. It replaces the existing Whirrbased installer and allows for a variety of deployment automations. For more information on Lily, please contact us at info@ngdata.com.
The Technology of the Business Data Lake. Appendix
The Technology of the Business Data Lake Appendix Pivotal data products Term Greenplum Database GemFire Pivotal HD Spring XD Pivotal Data Dispatch Pivotal Analytics Description A massively parallel platform
More informationBlended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)
Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Cloudera s Developer Training for Apache Spark and Hadoop delivers the key concepts and expertise need to develop high-performance
More informationSOLUTION TRACK Finding the Needle in a Big Data Innovator & Problem Solver Cloudera
SOLUTION TRACK Finding the Needle in a Big Data Haystack @EvaAndreasson, Innovator & Problem Solver Cloudera Agenda Problem (Solving) Apache Solr + Apache Hadoop et al Real-world examples Q&A Problem Solving
More informationOracle Big Data. A NA LYT ICS A ND MA NAG E MENT.
Oracle Big Data. A NALYTICS A ND MANAG E MENT. Oracle Big Data: Redundância. Compatível com ecossistema Hadoop, HIVE, HBASE, SPARK. Integração com Cloudera Manager. Possibilidade de Utilização da Linguagem
More informationMicrosoft Big Data and Hadoop
Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common
More informationData Lake Based Systems that Work
Data Lake Based Systems that Work There are many article and blogs about what works and what does not work when trying to build out a data lake and reporting system. At DesignMind, we have developed a
More informationBig Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours
Big Data Hadoop Developer Course Content Who is the target audience? Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Complete beginners who want to learn Big Data Hadoop Professionals
More informationHow Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,
How Apache Hadoop Complements Existing BI Systems Dr. Amr Awadallah Founder, CTO Cloudera, Inc. Twitter: @awadallah, @cloudera 2 The Problems with Current Data Systems BI Reports + Interactive Apps RDBMS
More informationData Storage Infrastructure at Facebook
Data Storage Infrastructure at Facebook Spring 2018 Cleveland State University CIS 601 Presentation Yi Dong Instructor: Dr. Chung Outline Strategy of data storage, processing, and log collection Data flow
More informationHortonworks DataPlane Service
Data Steward Studio Administration () docs.hortonworks.com : Data Steward Studio Administration Copyright 2016-2017 Hortonworks, Inc. All rights reserved. Please visit the Hortonworks Data Platform page
More informationEnterprise Data Catalog for Microsoft Azure Tutorial
Enterprise Data Catalog for Microsoft Azure Tutorial VERSION 10.2 JANUARY 2018 Page 1 of 45 Contents Tutorial Objectives... 4 Enterprise Data Catalog Overview... 5 Overview... 5 Objectives... 5 Enterprise
More informationOverview. : Cloudera Data Analyst Training. Course Outline :: Cloudera Data Analyst Training::
Module Title Duration : Cloudera Data Analyst Training : 4 days Overview Take your knowledge to the next level Cloudera University s four-day data analyst training course will teach you to apply traditional
More informationLecture 7 (03/12, 03/14): Hive and Impala Decisions, Operations & Information Technologies Robert H. Smith School of Business Spring, 2018
Lecture 7 (03/12, 03/14): Hive and Impala Decisions, Operations & Information Technologies Robert H. Smith School of Business Spring, 2018 K. Zhang (pic source: mapr.com/blog) Copyright BUDT 2016 758 Where
More informationBig Data Hadoop Stack
Big Data Hadoop Stack Lecture #1 Hadoop Beginnings What is Hadoop? Apache Hadoop is an open source software framework for storage and large scale processing of data-sets on clusters of commodity hardware
More informationGain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved.
Gain Insights From Unstructured Data Using Pivotal HD 1 Traditional Enterprise Analytics Process 2 The Fundamental Paradigm Shift Internet age and exploding data growth Enterprises leverage new data sources
More informationInstructor : Dr. Sunnie Chung. Independent Study Spring Pentaho. 1 P a g e
ABSTRACT Pentaho Business Analytics from different data source, Analytics from csv/sql,create Star Schema Fact & Dimension Tables, kettle transformation for big data integration, MongoDB kettle Transformation,
More informationThe Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou
The Hadoop Ecosystem EECS 4415 Big Data Systems Tilemachos Pechlivanoglou tipech@eecs.yorku.ca A lot of tools designed to work with Hadoop 2 HDFS, MapReduce Hadoop Distributed File System Core Hadoop component
More informationOracle Data Integrator 12c: Integration and Administration
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 67863102 Oracle Data Integrator 12c: Integration and Administration Duration: 5 Days What you will learn Oracle Data Integrator is a comprehensive
More informationFEATURES BENEFITS SUPPORTED PLATFORMS. Reduce costs associated with testing data projects. Expedite time to market
E TL VALIDATOR DATA SHEET FEATURES BENEFITS SUPPORTED PLATFORMS ETL Testing Automation Data Quality Testing Flat File Testing Big Data Testing Data Integration Testing Wizard Based Test Creation No Custom
More informationGetting Started With Intellicus. Version: 7.3
Getting Started With Intellicus Version: 7.3 Copyright 2015 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied or derived
More informationImporting and Exporting Data Between Hadoop and MySQL
Importing and Exporting Data Between Hadoop and MySQL + 1 About me Sarah Sproehnle Former MySQL instructor Joined Cloudera in March 2010 sarah@cloudera.com 2 What is Hadoop? An open-source framework for
More informationOracle Data Integrator 12c: Integration and Administration
Oracle University Contact Us: +34916267792 Oracle Data Integrator 12c: Integration and Administration Duration: 5 Days What you will learn Oracle Data Integrator is a comprehensive data integration platform
More informationBig Data. Big Data Analyst. Big Data Engineer. Big Data Architect
Big Data Big Data Analyst INTRODUCTION TO BIG DATA ANALYTICS ANALYTICS PROCESSING TECHNIQUES DATA TRANSFORMATION & BATCH PROCESSING REAL TIME (STREAM) DATA PROCESSING Big Data Engineer BIG DATA FOUNDATION
More informationOverview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::
Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized
More informationProcessing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.
Processing Unstructured Data Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd. http://dinesql.com / Dinesh Priyankara @dinesh_priya Founder/Principal Architect dinesql Pvt Ltd. Microsoft Most
More informationInnovatus Technologies
HADOOP 2.X BIGDATA ANALYTICS 1. Java Overview of Java Classes and Objects Garbage Collection and Modifiers Inheritance, Aggregation, Polymorphism Command line argument Abstract class and Interfaces String
More informationConfiguring and Deploying Hadoop Cluster Deployment Templates
Configuring and Deploying Hadoop Cluster Deployment Templates This chapter contains the following sections: Hadoop Cluster Profile Templates, on page 1 Creating a Hadoop Cluster Profile Template, on page
More informationBlended Learning Outline: Cloudera Data Analyst Training (171219a)
Blended Learning Outline: Cloudera Data Analyst Training (171219a) Cloudera Univeristy s data analyst training course will teach you to apply traditional data analytics and business intelligence skills
More informationSQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism
Big Data and Hadoop with Azure HDInsight Andrew Brust Senior Director, Technical Product Marketing and Evangelism Datameer Level: Intermediate Meet Andrew Senior Director, Technical Product Marketing and
More informationApache HAWQ (incubating)
HADOOP NATIVE SQL What is HAWQ? Apache HAWQ (incubating) Is an elastic parallel processing SQL engine that runs native in Apache Hadoop to directly access data for advanced analytics. Why HAWQ? Hadoop
More informationInformatica Enterprise Information Catalog
Data Sheet Informatica Enterprise Information Catalog Benefits Automatically catalog and classify all types of data across the enterprise using an AI-powered catalog Identify domains and entities with
More informationDell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Spark Technology Overview and Streaming Workload Use Cases Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationAn Introduction to Big Data Formats
Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationCertified Big Data Hadoop and Spark Scala Course Curriculum
Certified Big Data Hadoop and Spark Scala Course Curriculum The Certified Big Data Hadoop and Spark Scala course by DataFlair is a perfect blend of indepth theoretical knowledge and strong practical skills
More informationTowards a Real- time Processing Pipeline: Running Apache Flink on AWS
Towards a Real- time Processing Pipeline: Running Apache Flink on AWS Dr. Steffen Hausmann, Solutions Architect Michael Hanisch, Manager Solutions Architecture November 18 th, 2016 Stream Processing Challenges
More informationShark: Hive (SQL) on Spark
Shark: Hive (SQL) on Spark Reynold Xin UC Berkeley AMP Camp Aug 21, 2012 UC BERKELEY SELECT page_name, SUM(page_views) views FROM wikistats GROUP BY page_name ORDER BY views DESC LIMIT 10; Stage 0: Map-Shuffle-Reduce
More informationBigInsights and Cognos Stefan Hubertus, Principal Solution Specialist Cognos Wilfried Hoge, IT Architect Big Data IBM Corporation
BigInsights and Cognos Stefan Hubertus, Principal Solution Specialist Cognos Wilfried Hoge, IT Architect Big Data 2013 IBM Corporation A Big Data architecture evolves from a traditional BI architecture
More informationOracle Big Data Fundamentals Ed 2
Oracle University Contact Us: 1.800.529.0165 Oracle Big Data Fundamentals Ed 2 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, you learn about big data, the technologies
More informationHyperion Interactive Reporting Reports & Dashboards Essentials
Oracle University Contact Us: +27 (0)11 319-4111 Hyperion Interactive Reporting 11.1.1 Reports & Dashboards Essentials Duration: 5 Days What you will learn The first part of this course focuses on two
More informationCloudera Manager Quick Start Guide
Cloudera Manager Guide Important Notice (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans contained in this
More informationsqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010
sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010 Your database Holds a lot of really valuable data! Many structured tables of several hundred GB Provides fast access
More informationCloudera Introduction
Cloudera Introduction Important Notice 2010-2017 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, and any other product or service names or slogans contained in this document are trademarks
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationIntro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect
Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect Igor Roiter Big Data Cloud Solution Architect Working as a Data Specialist for the last 11 years 9 of them as a Consultant specializing
More informationMapR Enterprise Hadoop
2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS
More informationProduct Compatibility Matrix
Compatibility Matrix Important tice (c) 2010-2014, Inc. All rights reserved., the logo, Impala, and any other product or service names or slogans contained in this document are trademarks of and its suppliers
More informationIncrease Value from Big Data with Real-Time Data Integration and Streaming Analytics
Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time
More informationCertified Big Data and Hadoop Course Curriculum
Certified Big Data and Hadoop Course Curriculum The Certified Big Data and Hadoop course by DataFlair is a perfect blend of in-depth theoretical knowledge and strong practical skills via implementation
More informationTalend Open Studio for MDM Web User Interface. User Guide 5.6.2
Talend Open Studio for MDM Web User Interface User Guide 5.6.2 Talend Open Studio for MDM Web User Interface Adapted for v5.6.2. Supersedes previous releases. Publication date: May 12, 2015 Copyleft This
More informationCERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)
CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program
More informationBig Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018
Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/
More informationSecurity and Performance advances with Oracle Big Data SQL
Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationReport on The Infrastructure for Implementing the Mobile Technologies for Data Collection in Egypt
Report on The Infrastructure for Implementing the Mobile Technologies for Data Collection in Egypt Date: 10 Sep, 2017 Draft v 4.0 Table of Contents 1. Introduction... 3 2. Infrastructure Reference Architecture...
More information1Z Oracle Business Intelligence (OBI) Foundation Suite 11g Essentials Exam Summary Syllabus Questions
1Z0-591 Oracle Business Intelligence (OBI) Foundation Suite 11g Essentials Exam Summary Syllabus Questions Table of Contents Introduction to 1Z0-591 Exam on Oracle Business Intelligence (OBI) Foundation
More informationEnabling Secure Hadoop Environments
Enabling Secure Hadoop Environments Fred Koopmans Sr. Director of Product Management 1 The future of government is data management What s your strategy? 2 Cloudera s Enterprise Data Hub makes it possible
More informationCloudera Introduction
Cloudera Introduction Important Notice 2010-2018 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, and any other product or service names or slogans contained in this document are trademarks
More informationHadoop. Introduction / Overview
Hadoop Introduction / Overview Preface We will use these PowerPoint slides to guide us through our topic. Expect 15 minute segments of lecture Expect 1-4 hour lab segments Expect minimal pretty pictures
More informationA Review Approach for Big Data and Hadoop Technology
International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 A Review Approach for Big Data and Hadoop Technology Prof. Ghanshyam Dhomse
More informationShark: Hive (SQL) on Spark
Shark: Hive (SQL) on Spark Reynold Xin UC Berkeley AMP Camp Aug 29, 2013 UC BERKELEY Stage 0:M ap-shuffle-reduce M apper(row ) { fields = row.split("\t") em it(fields[0],fields[1]); } Reducer(key,values)
More informationMAPR DATA GOVERNANCE WITHOUT COMPROMISE
MAPR TECHNOLOGIES, INC. WHITE PAPER JANUARY 2018 MAPR DATA GOVERNANCE TABLE OF CONTENTS EXECUTIVE SUMMARY 3 BACKGROUND 4 MAPR DATA GOVERNANCE 5 CONCLUSION 7 EXECUTIVE SUMMARY The MapR DataOps Governance
More informationModern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics
More informationEsgynDB Enterprise 2.0 Platform Reference Architecture
EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed
More informationBig Data Architect.
Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional
More informationCurriculum Guide. ThingWorx
Curriculum Guide ThingWorx Live Classroom Curriculum Guide Introduction to ThingWorx 8 ThingWorx 8 User Interface Development ThingWorx 8 Platform Administration ThingWorx 7.3 Fundamentals Applying Machine
More informationGetting Started with Intellicus. Version: 16.0
Getting Started with Intellicus Version: 16.0 Copyright 2016 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied or derived
More informationBig Data Hadoop Course Content
Big Data Hadoop Course Content Topics covered in the training Introduction to Linux and Big Data Virtual Machine ( VM) Introduction/ Installation of VirtualBox and the Big Data VM Introduction to Linux
More informationBig Data with Hadoop Ecosystem
Diógenes Pires Big Data with Hadoop Ecosystem Hands-on (HBase, MySql and Hive + Power BI) Internet Live http://www.internetlivestats.com/ Introduction Business Intelligence Business Intelligence Process
More informationOracle Enterprise Data Quality - Roadmap
Oracle Enterprise Data Quality - Roadmap Mike Matthews Martin Boyd Director, Product Management Senior Director, Product Strategy Copyright 2014 Oracle and/or its affiliates. All rights reserved. Oracle
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationIntegrating Big Data with Oracle Data Integrator 12c ( )
[1]Oracle Fusion Middleware Integrating Big Data with Oracle Data Integrator 12c (12.2.1.1) E73982-01 May 2016 Oracle Fusion Middleware Integrating Big Data with Oracle Data Integrator, 12c (12.2.1.1)
More informationCloud Computing & Visualization
Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International
More informationCOURSE 20466D: IMPLEMENTING DATA MODELS AND REPORTS WITH MICROSOFT SQL SERVER
ABOUT THIS COURSE The focus of this five-day instructor-led course is on creating managed enterprise BI solutions. It describes how to implement multidimensional and tabular data models, deliver reports
More informationHow to choose the right approach to analytics and reporting
SOLUTION OVERVIEW How to choose the right approach to analytics and reporting A comprehensive comparison of the open source and commercial versions of the OpenText Analytics Suite In today s digital world,
More informationData Access 3. Managing Apache Hive. Date of Publish:
3 Managing Apache Hive Date of Publish: 2018-07-12 http://docs.hortonworks.com Contents ACID operations... 3 Configure partitions for transactions...3 View transactions...3 View transaction locks... 4
More informationHadoop An Overview. - Socrates CCDH
Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected
More informationMicrosoft SharePoint Server 2013 Plan, Configure & Manage
Microsoft SharePoint Server 2013 Plan, Configure & Manage Course 20331-20332B 5 Days Instructor-led, Hands on Course Information This five day instructor-led course omits the overlap and redundancy that
More informationHive SQL over Hadoop
Hive SQL over Hadoop Antonino Virgillito THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Introduction Apache Hive is a high-level abstraction on top of MapReduce Uses
More informationChase Wu New Jersey Institute of Technology
CS 644: Introduction to Big Data Chapter 4. Big Data Analytics Platforms Chase Wu New Jersey Institute of Technology Some of the slides were provided through the courtesy of Dr. Ching-Yung Lin at Columbia
More informationIntroduction to Big-Data
Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,
More informationThis is a brief tutorial that explains how to make use of Sqoop in Hadoop ecosystem.
About the Tutorial Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL, Oracle to Hadoop HDFS, and
More informationHive and Shark. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic)
Hive and Shark Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Hive and Shark 1393/8/19 1 / 45 Motivation MapReduce is hard to
More informationMaking the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor
Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,
More informationCourse Contents: 1 Business Objects Online Training
IQ Online training facility offers Business Objects online training by trainers who have expert knowledge in the Business Objects and proven record of training hundreds of students Our Business Objects
More informationC_HANAIMP142
C_HANAIMP142 Passing Score: 800 Time Limit: 4 min Exam A QUESTION 1 Where does SAP recommend you create calculated measures? A. In a column view B. In a business layer C. In an attribute view D. In an
More informationOSR Administration 3.7 User Guide. Updated:
OSR Administration 3.7 User Guide Updated: 2013-01-31 Copyright OneStop Reporting AS www.onestopreporting.com Table of Contents Introduction... 1 Who should read this manual... 1 What s included in this
More informationInformation empowerment for your evolving data ecosystem
Information empowerment for your evolving data ecosystem Highlights Enables better results for critical projects and key analytics initiatives Ensures the information is trusted, consistent and governed
More informationDATA SCIENCE USING SPARK: AN INTRODUCTION
DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data
More informationSAS. Information Map Studio 3.1: Creating Your First Information Map
SAS Information Map Studio 3.1: Creating Your First Information Map The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2006. SAS Information Map Studio 3.1: Creating Your
More informationMicrosoft Exam
Volume: 42 Questions Case Study: 1 Relecloud General Overview Relecloud is a social media company that processes hundreds of millions of social media posts per day and sells advertisements to several hundred
More informationHadoop Overview. Lars George Director EMEA Services
Hadoop Overview Lars George Director EMEA Services 1 About Me Director EMEA Services @ Cloudera Consulting on Hadoop projects (everywhere) Apache Committer HBase and Whirr O Reilly Author HBase The Definitive
More informationApril Copyright 2013 Cloudera Inc. All rights reserved.
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on
More informationOracle Big Data Fundamentals Ed 1
Oracle University Contact Us: +0097143909050 Oracle Big Data Fundamentals Ed 1 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationWhat does SAS Data Management do? For whom is SAS Data Management designed? Key Benefits
FACT SHEET SAS Data Management Transform raw data into a valuable business asset What does SAS Data Management do? SAS Data Management helps transform, integrate, govern and secure data while improving
More informationOracle BDA: Working With Mammoth - 1
Hello and welcome to this online, self-paced course titled Administering and Managing the Oracle Big Data Appliance (BDA). This course contains several lessons. This lesson is titled Working With Mammoth.
More informationTECHNICAL OVERVIEW OF NEW AND IMPROVED FEATURES OF EMC ISILON ONEFS 7.1.1
TECHNICAL OVERVIEW OF NEW AND IMPROVED FEATURES OF EMC ISILON ONEFS 7.1.1 ABSTRACT This introductory white paper provides a technical overview of the new and improved enterprise grade features introduced
More informationPlanning and Administering SharePoint 2016
Planning and Administering SharePoint 2016 Course 20339A 5 Days Instructor-led, Hands on Course Information This five-day course will combine the Planning and Administering SharePoint 2016 class with the
More information