Lily 2.4 What's New Product Release Notes




WHAT'S NEW IN LILY

Table of Contents

Purpose and Overview of this Document
Product Overview
General
  Prerequisites
The Lily Data Repository
  Side Effect Processor
  Multi-Repository Support
  Metadata
  Record Type Inheritance
  HDFS Name Node Support
The Lily Customer Database
  Foundations
  Master Records
  Hive Integration
  Interaction Summary Attribute Calculation Engine
  Lily Customer Database Explorer
  User Interface
  Item Explorer
Lily Customer Intelligence Applications
  Rules-Based Recommendation Strategies
  Knowledge-Based Recommendations
Provisioning

Purpose and Overview of this Document

This document is a comprehensive, Customer-oriented overview of all feature additions, changes, and enhancements available in Lily Release 2.4. Lily 2.4 was released on 22 July 2013; the main focus of the release is updates to the Lily Customer Database and the Lily Data Repository.

Product Overview

Lily, the Customer Intelligence Platform, Release 2.4 is composed of the following products:

- Lily Data Repository (DR)
- Lily Customer Database (CDB)
- Lily Customer Intelligence Applications (CA)
- Lily Enterprise tools:
  o Cluster installation
  o ETL tool connectivity
  o Hive connectivity

The Lily Customer Database relies on the Data Repository, and the Customer Intelligence Applications rely on the Customer Database.

General

Prerequisites

Lily 2.4 has been certified against Cloudera's Distribution Including Apache Hadoop (CDH4) and therefore requires the following:

- CDH4.2
  o Supported Operating Systems: docs/cdh4/4.2.0/cdh4-requirements-and-supported-Versions/cdhrsv_topic_1.html
  o Supported Databases: content/cloudera-docs/cdh4/4.2.0/cdh4-requirements-and-Supported-Versions/cdhrsv_topic_2.html
  o Supported JDK: content/cloudera-docs/cdh4/4.2.0/cdh4-requirements-and-Supported-Versions/cdhrsv_topic_3.html

The Lily Data Repository

Release 2.4 of the Lily Data Repository includes a number of key enhancements and functional extensions. With these enhancements, NGDATA has strengthened and expanded the foundation of the Lily Data Repository for all users and continues to provide a best-in-breed solution for all clients.

Side Effect Processor

A new component, the Side Effect Processor (SEP), has been developed for the Data Repository to process asynchronous update operations more efficiently. The SEP sits at the core of Lily's triggering and indexing mechanism and replaces parts of the existing RowLog engine. With this enhancement, Lily is three times faster with indexing switched off, and five times faster when using indexing. Maintaining multiple indexes at the same time has no additional performance impact. The SEP is also used for computing aggregates and maintaining additional data indexes in the Customer Database.

Multi-Repository Support

With this release, records held in the Lily Data Repository can be stored across multiple Lily Repositories within HBase. This supports the logical segregation of data and provides additional performance gains and data security. It is possible to host multiple data repositories in one shared installation, e.g. to operate a shared Lily for multiple departments.

Metadata

With this release, Lily supports field-level metadata. This metadata consists of key/value pairs, where the key is a string and the value is a simple type such as String, Integer, Long, Float, Double, Boolean, or byte. This allows Users to record information such as Source, Creator, Timestamp, or Quality without having to predefine a field for it in the schema, which makes it simpler for an application to add this data. Among other uses, field metadata may be used to store access control information.
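
The field-level metadata model described above can be pictured with a small Python sketch. This is not the Lily API: the FieldValue class, the set_metadata method, and the mapping of Lily's Integer/Long and Float/Double types onto Python's int and float are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Simple value types named in the release notes, mapped onto Python types.
# The mapping itself (Integer/Long -> int, Float/Double -> float) is an assumption.
ALLOWED_TYPES = (str, int, float, bool, bytes)


@dataclass
class FieldValue:
    """Hypothetical container: a field value plus free-form metadata."""
    value: Any
    metadata: Dict[str, Any] = field(default_factory=dict)

    def set_metadata(self, key: str, value: Any) -> None:
        # Keys must be strings; values must be one of the simple types.
        if not isinstance(key, str):
            raise TypeError("metadata keys must be strings")
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"unsupported metadata type: {type(value).__name__}")
        self.metadata[key] = value


# No schema change is needed to attach new metadata keys to a field value.
email = FieldValue("jane@example.com")
email.set_metadata("source", "crm-export")
email.set_metadata("quality", 0.9)
email.set_metadata("verified", True)
```

The point of the sketch is that metadata keys are not declared anywhere in advance; only the value types are restricted.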

Record Type Inheritance

When creating a schema definition, the Lily Data Repository now supports the concept of Supertypes. This allows a User to define a base record type and then inherit the structure of that base type in another record type that extends it. As an example:

    recordtypes: [
      {
        name: person,
        fields: [
          { name: name, mandatory: true },
          { name: address, mandatory: true }
        ]
      },
      {
        name: Customer,
        fields: [
          { name: Customer Number, mandatory: true }
        ],
        supertypes: [ { name: person } ]
      },
      {
        name: employee,
        fields: [
          { name: employeenumber, mandatory: true }
        ],
        supertypes: [ { name: person } ]
      }
    ]

In this example, the Customer record will have the fields Name, Address, and Customer Number, while the Employee record will have Name, Address, and Employee Number. Because shared fields are defined once in the base type, record type inheritance keeps schema definitions concise.

HDFS Name Node Support

Lily 2.4 Enterprise ships with deployment configuration and instructions to support high availability of the Hadoop HDFS Name Node service. This allows enterprise clients to run Hadoop in a fault-tolerant configuration and maintain system uptime in the event of server failure.
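
Returning to the Record Type Inheritance example, the way a record type's effective field list follows from its supertypes can be sketched in a few lines of Python. The RECORD_TYPES dictionary and the effective_fields function are hypothetical stand-ins that mirror the example schema; they are not part of Lily.

```python
from typing import Dict, List

# Mirrors the example schema; plain dicts stand in for Lily record types.
RECORD_TYPES: Dict[str, dict] = {
    "person": {"fields": ["name", "address"], "supertypes": []},
    "Customer": {"fields": ["Customer Number"], "supertypes": ["person"]},
    "employee": {"fields": ["employeenumber"], "supertypes": ["person"]},
}


def effective_fields(type_name: str, seen=None) -> List[str]:
    """Return inherited fields first, then the type's own fields, without duplicates."""
    seen = set() if seen is None else seen
    if type_name in seen:  # guard against cyclic supertype definitions
        return []
    seen.add(type_name)
    record_type = RECORD_TYPES[type_name]
    result: List[str] = []
    for parent in record_type["supertypes"]:
        for name in effective_fields(parent, seen):
            if name not in result:
                result.append(name)
    for name in record_type["fields"]:
        if name not in result:
            result.append(name)
    return result


print(effective_fields("Customer"))  # ['name', 'address', 'Customer Number']
```

Listing inherited fields before the type's own fields is a design choice of this sketch; the release notes do not specify an ordering.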

The Lily Customer Database

Foundations

At the foundation of the Lily Customer Database, this release delivers two enhancements:

- Interaction Timestamps
  o Lily adds a Timestamp to every Interaction for which no Interaction Timestamp was specified in the input data, ensuring that the Timestamp field always contains a value
- Automatic Creation of Item or Customer
  o Lily adds a new Item and/or Customer record when an Interaction is logged for a non-existent Item and/or Customer, ensuring that 100% of the Interactions are prepared for further analysis

Master Records

An enhancement to the Lily Customer Database allows Users to specify that Customer records from multiple sources refer to the same Customer. When this occurs, a Master Customer record is created to combine the data from the different sources. Each Source Record is kept intact for auditing purposes, while the Master Record is updated to reflect a combination of the most recent data. This gives Lily Users a single view of their Customers in a central place, based on data from different systems.

Hive Integration

Release 2.4 expands the existing Lily Hive integration with a number of enhancements and new features, highlighted below:

- Generate SQL-like queries (HiveQL) for searching data in the Customer Database through a wizard in the Lily User Interface
  o This User Interface allows a User to generate a Hive query that can be run offline
  o The query can be copied into and used inside popular BI or data exploration tools that support Hive, such as Tableau or Toad4Cloud

- Search data in Lily via Hive using field-level metadata
- Store data set statistics in the HDFS layer of Lily

By expanding the Hive integration, NGDATA continues to improve the ability of Users to analyze the data stored in Lily.

Interaction Summary Attribute Calculation Engine

The Lily Customer Database has been updated with an engine for both real-time and batch calculation of Summary Attributes that can be defined by the Client, such as number of visits, average amount spent, or time on website. Lily supports the following basic aggregators: Max, Min, Average, Count, Distinct Count, Last, Sum.

Summary Attributes are calculated values based on the analysis of incoming or processed interactions. Summary Attributes are always stored on a single Customer or Item record, which is comparable to an SQL query with an aggregation (e.g. SUM) and a GROUP BY Customer.

The real-time functionality of this calculation engine has been developed as a SEP component that computes the Summary Attributes during ingestion of the data. The calculated values can also be used in the Lily Customer Database Explorer for Faceted Search, allowing for a richer User experience in the User Interface and a continued expansion of Lily as a data exploration tool. The use of Summary Attributes within Faceted Search can be enabled or disabled per Summary Attribute. In addition to the real-time functionality, Lily also supports a batch-based rebuild of all Summary Attributes using the parallel processing power of MapReduce.

Lily Customer Database Explorer

The Lily Customer Database Explorer has been enhanced to support facet-based searches using graph widgets (such as bar charts and pie charts). This type of User Interface navigation gives the User visual insight into the Lily Customer Database, allowing a step-by-step creation of Customer or Item target views. The Lily Customer Database Explorer also makes it possible to select one or more Segments to aid segment analysis through interactive use and selection within the graph-based User Interface widgets.
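
The Summary Attribute section compares each attribute to an SQL aggregation with a GROUP BY Customer. A minimal Python sketch of that idea follows. The aggregator names come from the release notes, but the AGGREGATORS table, the summarize function, and the (customer_id, amount) input shape are assumptions for illustration; the sketch recomputes everything in one pass, which is closer to the batch rebuild than to the real-time SEP component.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Aggregator names follow the release notes; the implementations are illustrative.
AGGREGATORS = {
    "Max": max,
    "Min": min,
    "Average": lambda values: sum(values) / len(values),
    "Count": len,
    "Distinct Count": lambda values: len(set(values)),
    "Last": lambda values: values[-1],
    "Sum": sum,
}


def summarize(interactions: Iterable[Tuple[str, float]], aggregator: str) -> Dict[str, float]:
    """Group (customer_id, amount) pairs by customer and aggregate each group."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for customer_id, amount in interactions:
        grouped[customer_id].append(amount)
    aggregate = AGGREGATORS[aggregator]
    return {customer_id: aggregate(values) for customer_id, values in grouped.items()}


# Hypothetical interaction log: (customer id, amount spent), in arrival order.
interactions = [("c1", 20.0), ("c2", 5.0), ("c1", 30.0), ("c1", 20.0)]
total_spent = summarize(interactions, "Sum")   # one value per customer
visits = summarize(interactions, "Count")
```

Each result maps one customer to one value, matching the statement that a Summary Attribute is always stored on a single Customer or Item record.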

The graph-based widgets use faceted data provided by the Lily Customer Database as input and support the following types of facets:

- Aggregated Facets: facets that make use of interaction summary attributes, such as shopping frequency or click count
- Field Facets: facets that use basic field values rather than summarized data, such as gender or region

User Interface

With Release 2.4, the Customer Database Explorer can drill down to individual Customer records based on a Faceted Search across the following areas of Customer classification:

- Identity data
- Profile and segmentation data
- Behavioral data aggregated from customer interactions
- Preferences calculated by scoring and learning engines
- Source selection

A User can filter on selected facet values through an Include/Exclude function, allowing for easy audience or item selection through a query-by-example interface.
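
The Include/Exclude filtering on facet values described above can be sketched as follows. The record layout and the facet_counts and apply_filter functions are hypothetical and only illustrate the query-by-example idea; they do not reflect how the Explorer is implemented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

Record = Dict[str, str]


def facet_counts(records: Iterable[Record], field_name: str) -> Counter:
    """Count how many records carry each value of one field facet."""
    return Counter(r[field_name] for r in records if field_name in r)


def apply_filter(
    records: Iterable[Record],
    field_name: str,
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[Record]:
    """Keep records whose facet value is included (if given) and not excluded."""
    include_set = set(include or [])
    exclude_set = set(exclude or [])
    selected = []
    for record in records:
        value = record.get(field_name)
        if include_set and value not in include_set:
            continue
        if value in exclude_set:
            continue
        selected.append(record)
    return selected


# Hypothetical Customer records with two field facets.
customers = [
    {"id": "c1", "gender": "F", "region": "North"},
    {"id": "c2", "gender": "M", "region": "South"},
    {"id": "c3", "gender": "F", "region": "South"},
]

south = apply_filter(customers, "region", include=["South"])
south_not_male = apply_filter(south, "gender", exclude=["M"])
```

Chaining filters narrows the audience step by step, while facet_counts supplies the numbers a bar or pie chart widget would display.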

Additionally, a User can view the results of the filter. For each individual record, the results show the ID, Identity, Behavior, and Preference information.

Item Explorer

With the release of Lily 2.4, a User can also explore Item data through a Faceted Search. The User can begin the search by product category, filter on facets, and delve into the data of individual Items.

Lily Customer Intelligence Applications

Rules-Based Recommendation Strategies

The Rules-Based Recommendation Strategies functionality has been enhanced to support Strategies through dynamic business rules for pre- and post-processing of Recommendations. This includes selecting recommendation engines, selecting different models within a Recommendation Engine, bypassing recommendation engine consultation, and influencing scores based on decision tables. A variety of Business Rules may be defined specific to a Client's needs, and a User can change the calculations or activate and deactivate individual rules without having to redeploy a new version of the application.

Knowledge-Based Recommendations

This release of Lily delivers a new Knowledge-Based Recommendation Engine, which uses knowledge about Users and Items to reason about which products meet a User's requirements. It supports the configuration and application of custom-made, domain-specific recommendation scores based on aggregates and overall Customer interaction data. This allows Users to determine recommendations for a Customer based solely on that Customer's behavior, without taking into account the interactions of other Customers as the Lily Collaborative Filtering Machine Learning Recommendation Engine does.
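
As an illustration of the rules-based post-processing described under Rules-Based Recommendation Strategies, the sketch below adjusts recommendation scores with a small rule list in which each rule can be switched on or off at run time. The Rule class, the post_process function, the multiplier-based scoring, and the item attributes are all assumptions; the release notes do not describe the actual rule format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical business rule: scale the score of every item it applies to."""
    name: str
    applies: Callable[[dict], bool]
    multiplier: float
    enabled: bool = True


def post_process(
    scores: Dict[str, float], items: Dict[str, dict], rules: List[Rule]
) -> Dict[str, float]:
    """Apply every enabled rule, in order, to each recommended item's score."""
    adjusted = {}
    for item_id, score in scores.items():
        for rule in rules:
            if rule.enabled and rule.applies(items[item_id]):
                score *= rule.multiplier
        adjusted[item_id] = score
    return adjusted


# Hypothetical item attributes and engine scores.
items = {
    "a": {"in_stock": True, "margin": "high"},
    "b": {"in_stock": False, "margin": "low"},
}
scores = {"a": 0.5, "b": 0.8}
rules = [
    Rule("boost-high-margin", lambda item: item["margin"] == "high", 2.0),
    Rule("suppress-out-of-stock", lambda item: not item["in_stock"], 0.0),
]

with_all_rules = post_process(scores, items, rules)

# Deactivating a rule changes the outcome without touching the code.
rules[1].enabled = False
without_stock_rule = post_process(scores, items, rules)
```

Keeping the rules as data rather than code is what makes activation and deactivation possible without a redeployment in this sketch.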

Provisioning

With the 2.4 release of Lily, a new provisioning framework has been created to assist in the installation and upgrade of Lily clusters. This framework is a Python solution based on Fabric and is capable of installing an entire Lily cluster, including Cloudera Manager and CDH4. Using this tool, it is possible to create a Lily cluster on EC2 and have it running within 15 minutes. It replaces the existing Whirr-based installer and allows for a variety of deployment automations.

For more information on Lily, please contact us at info@ngdata.com.
