Datameer for Data Preparation:

Similar documents
Datameer Big Data Governance. Bringing open-architected and forward-compatible governance controls to Hadoop analytics

Data Management Glossary

Enabling Secure Hadoop Environments

New in This Version. Numeric Filtergram

SAP Agile Data Preparation Simplify the Way You Shape Data PUBLIC

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET

HDP Security Overview

HDP Security Overview

Governance Best Practices With Datameer WHITE PAPER

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

See Types of Data Supported for information about the types of files that you can import into Datameer.

Enterprise Data Catalog Fixed Limitations ( Update 1)

alteryx training courses

Information empowerment for your evolving data ecosystem

CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED DATA PLATFORM

Data in the Cloud and Analytics in the Lake

Oracle Big Data Connectors

Data Preparation for Enhancing Modern BI: Common Design Patterns

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data

Microsoft Power BI for O365

Capability White Paper Straight-Through-Processing (STP)

What's New in SAS Data Management

Self-Service Data Preparation for Qlik. Cookbook Series Self-Service Data Preparation for Qlik

The Definitive Guide to Preparing Your Data for Tableau

Enterprise Data Catalog for Microsoft Azure Tutorial

How to choose the right approach to analytics and reporting

Smart Data Catalog DATASHEET

Logi Info v12.5 WHAT S NEW

Optimized Data Integration for the MSO Market

Six Core Data Wrangling Activities. An introductory guide to data wrangling with Trifacta

Copyright 2016 Datalynx Pty Ltd. All rights reserved. Datalynx Enterprise Data Management Solution Catalogue

Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture

Security and Performance advances with Oracle Big Data SQL

Big Data security, tools and tips to protect information assets

A detailed comparison of EasyMorph vs Tableau Prep

Oracle 1Z Oracle Big Data 2017 Implementation Essentials.

6 SSIS Expressions SSIS Parameters Usage Control Flow Breakpoints Data Flow Data Viewers

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

SOLUTION BRIEF BIG DATA SECURITY

What s new in Excel 2013? Provided by Work Smart

WHAT S NEW IN QLIKVIEW 11

Hortonworks DataPlane Service

Gain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved.

MicroStrategy Academic Program

Stages of Data Processing

SAP BW 3.5 Enhanced Reporting Capabilities SAP AG

Informatica Enterprise Information Catalog

What s New in Spotfire DXP 1.1. Spotfire Product Management January 2007

Best practices for building a Hadoop Data Lake Solution CHARLOTTE HADOOP USER GROUP

Excel and Tableau. A Beautiful Partnership. Faye Satta, Senior Technical Writer Eriel Ross, Technical Writer

Processing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.

The Technology of the Business Data Lake. Appendix

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.

Oracle Big Data Discovery

Working with Workbooks

Is Your Data Viable? Preparing Your Data for SAS Visual Analytics 8.2

Talend Data Preparation Free Desktop. Getting Started Guide V2.1

Export out report results in multiple formats like PDF, Excel, Print, , etc.

COPYRIGHT DATASHEET

Microsoft Azure Databricks for data engineering. Building production data pipelines with Apache Spark in the cloud

Using Cohesity with Amazon Web Services (AWS)

EZY Intellect Pte. Ltd., #1 Changi North Street 1, Singapore

User Guide. Data Preparation R-1.1

Value of Data Transformation. Sean Kandel, Co-Founder and CTO of Trifacta

Lenses 2.1 Enterprise Features PRODUCT DATA SHEET

IBM InfoSphere Information Server Version 8 Release 7. Reporting Guide SC

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

User Guide. Data Preparation R-1.0

Evaluating Cloud Databases for ecommerce Applications. What you need to grow your ecommerce business

Hortonworks Data Platform

Data Governance Overview

Oracle Cloud E

Oracle Big Data Discovery

Hortonworks DataFlow Sam Lachterman Solutions Engineer

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Modern Data Warehouse The New Approach to Azure BI

SAS (Statistical Analysis Software/System)

The Now Platform Reference Guide

Power Query for Parsing Data

NGS-I. 7 New Generation Software, Inc N. Freeway Blvd., Ste. 200 Sacramento, CA

Intellicus Enterprise Reporting and BI Platform

Introduction to Data Science

IBM Software IBM InfoSphere Information Server for Data Quality

Technical Sheet NITRODB Time-Series Database

Chapter 3. Foundations of Business Intelligence: Databases and Information Management

$99.95 per user. SQL Server 2008 Integration Services CourseId: 158 Skill level: Run Time: 42+ hours (210 videos)

DOWNLOAD OR READ : EXCEL DATA ANALYSIS YOUR VISUAL BLUEPRINT FOR CREATING AND ANALYZING DATA CHARTS AND PIVOT TABLES 3 PDF EBOOK EPUB MOBI

What does SAS Data Management do? For whom is SAS Data Management designed? Key Benefits

I CAN T FIND THE #$%& DATA. Why You Need a Data Catalog

Xcelerated Business Insights (xbi): Going beyond business intelligence to drive information value

Designing your BI Architecture

Blended Learning Outline: Cloudera Data Analyst Training (171219a)

Chapter 6 VIDEO CASES

Asanka Padmakumara. ETL 2.0: Data Engineering with Azure Databricks

SAS (Statistical Analysis Software/System)

WHITE PAPER. The General Data Protection Regulation: What Title It Means and How SAS Data Management Can Help

Designing dashboards for performance. Reference deck

DATA STEWARDSHIP BODY OF KNOWLEDGE (DSBOK)

Qlik Sense Desktop. Data, Discovery, Collaboration in minutes. Qlik Sense Desktop. Qlik Associative Model. Get Started for Free

10 WAYS DATA PREPARATION CAN HELP EXCEL

Transcription:

Datameer for Data Preparation: Explore, Profile, Blend, Cleanse, Enrich, Share, Operationalize

DATAMEER FOR DATA PREPARATION: EXPLORE, PROFILE, BLEND, CLEANSE, ENRICH, SHARE, OPERATIONALIZE Datameer Datameer for Data Preparation Datameer offers the complete set of capabilities you need for collaborative, operational data preparation pipelines to serve the needs of both your data and business analysts. With its data preparation functionality, Datameer allows you to: Explore and profile data visually Cleanse data to ensure validity Blend a variety of data formats Enrich and transform data Shape and organize datasets Share and collaborate datasets Operationalize data pipelines Govern with an open-secure model Explore and Profile Data Visually Explore all types of data Reach the next level in data discovery. With Datameer, you can access and use any type of data, anywhere in the world. Freely upload and parse any file formats for analysis, including: HTML JSON XML CSV Text documents Apache logs Pre-built data connectors and parsers Extract from 70+ data sources and a variety of pre-built data formats REST API allows any new file formats that suit your needs Extract custom structures Examples include custom data parsing, tokenizing for text parsing and time zone support Extract matching patterns with advanced parsing through regular expressions PAGE 2

Datameer DATA SHEET Profile data visually Gain a new level of understanding. With Datameer Flipside, you have instantaneous statistical data profiling of your data at any step in the preparation and analytic process. Now you can quickly determine the next preparation steps for your data which means a more accurate view of the data. Flipside also allows you to uncover hidden patterns. And with better profiling, you ll have a better sense of the data s accuracy. More accuracy and more recently refreshed data provides better insights. PAGE 3

DATAMEER FOR DATA PREPARATION: EXPLORE, PROFILE, BLEND, CLEANSE, ENRICH, SHARE, OPERATIONALIZE Datameer Cleanse Data to Ensure Validity Improve data quality Who said data cleansing had to be tedious? With Datameer, you can quickly identify outliers, duplicates, inconsistent values and more. Identify outliers Use Flipside data profiles to gain visual feedback. Handle duplicates Use a formula for de-duplication. Remove inconsistencies Filter missing values, nulls, blanks and missed cardinalities. Custom rules for cleansing Use the formula builder for any advanced patterns in the dataset. It identifies edge cases or unique patterns in the data and whether they need to be corrected. Blend a Variety of Data Formats Use additional data or explore it in different ways. After completing the data preparation workflow, you can blend datasets or embed meaningful data elements for future analysis. Join multiple datasets Join disparate sources without much data lineage Blend data easily with inner, outer, self and range JOINs. Multiple JOINs can be chained in one step (e.g. multi-lookups). Blend a variety of data formats Easily blend structured with unstructured data Union feature helps you simply append datasets to each other without a cardinality requirement. Many multi-structured data sources do not have any obvious matching join keys and the formula builder can help perform fuzzy joins in such scenarios. PAGE 4

Datameer DATA SHEET Enrich and Transform Data Understand data relationships and the meaning of data. For the analyst, data conditioning and enrichment are an important part of the cycle. Use sophisticated transformations Time saving Column splitting Column and row pivoting Statistical grouping Advanced text parsing Working with lists (concatenate fields to lists and expand to rows) Path construction Powerful if and comparison functions Date, time and text manipulation functions Enrich datasets inline with support for expressions Analytic enrichment Create valuable insights from data to complete the journey of data prep and data discovery. Datameer provides a rich array of capabilities to enrich datasets with analytics including: Path analysis Graph analytics Statistical functions Clustering Correlations Decision trees Text mining Sentiment analysis These are all available via Datameer Smart Analytics PAGE 5

DATAMEER FOR DATA PREPARATION: EXPLORE, PROFILE, BLEND, CLEANSE, ENRICH, SHARE, OPERATIONALIZE Datameer Shape and Organize Datasets Go beyond standard slicing-and-dicing of data Find hidden patterns and trends in the data. Organize data during the data preparation phase in different ways than with typical dimensions and metrics used by standard analytics. The Datameer function API allows users to write functions for domain specific transformations. Use built-in functions to organize and shape data Sessionization Custom binning Time windowing Advanced statistical grouping Function API to create domain-rich datasets Drive analytical use cases Fraud Preventative maintenance Buying patterns Clickstream analysis Time-series analysis And more Share and Collaborate Datasets Browse through files Organize and search for different data artifacts used across the enterprise Organize files Allow access to various folders across the organization, depending on projects and teams working on it Manage access to these artifacts for files or entire folders PAGE 6

Datameer DATA SHEET Reusable data artifacts Can be any data prep component such as: Connection to a data source Import job with raw data Workbook with data prep logic or a prepared dataset Chart or visualization published for sharing with any consumers of data Entire data prep workflow with chained workbooks Operationalize Data Pipelines Answering modern business questions isn t a one-time affair. As new data arrives, answers to the same question need to be continuously fed to business teams so they can perform timely, accurate actions. Automate custom data preparation steps Datameer gives you a full suite of capabilities to operationalize your data preparation pipelines to deliver results to downstream analysts and business teams. This includes: Sophisticated job scheduling and management Adaptable data retention and management policies A scalable, intelligent execution framework Full lineage, auditing and logging Complete security, including encryption and masking PAGE 7

DATAMEER FOR DATA PREPARATION: EXPLORE, PROFILE, BLEND, CLEANSE, ENRICH, SHARE, OPERATIONALIZE Datameer Govern With an Open-Secure Model For effective data-preparation pipelines and deeper analytic insights, organizations need a highly collaborative environment. Sharing both analytic datasets and content enables analysts to cast wide question nets while ensuring data and results are in line with corporate accuracy standards and compliance requirements. Maintain compliance and consistency Datameer Enterprise User Management LDAP / AD Integration SAML / Single Sign-On Permissions and sharing Role-based security (with custom roles) Kerberos Integration Sheet dependency graph Audit & data volume logs Data retention policies Secure impersonation to HDFS, YARN, Hive (Sentry), HBase Basic Sentry integration Encrypted metadata (with key rotation) Datameer maintains full lineage, auditing the definition of and any changes to artifacts and calculations, and supporting even the most rigid compliance processes. Datameer also includes a deep suite of encryption and obfuscation features, allowing you to keep data such as Personally Identifiable Information (PII) safe and secure using SHA1 plugins for obfuscation. Migration tools are bundled from development to production. Fine-grained, role-based security Datameer ensures full control over the data and models. Datameer also keeps a full catalog of datasets, models and derived content, and enables sharing with virtual team members across the organization. Irrespective of your analyst team s organization centralized, virtual or distributed Datameer has the complete suite of data preparation capabilities needed for cleansing, modeling, enrichment, collaboration, governance and operationalization. To learn more, please visit our website at www.datameer.com. PAGE 8