Smart Data Catalog DATASHEET

There is so much data distributed across organizations that data and business professionals don't know what data is available or valuable. When it's time to create a new report or a dashboard for the CxO, to respond to a request from a government regulator, or simply to figure out whether you can eliminate some of the excess data in your organization, the first step is usually a mad scramble to understand your existing data environment.

The Waterline Smart Data Catalog discovers and raises trusted data above the waterline so you have the data you need to effectively run your organization. We automate the discovery, matching and tagging process and ensure the catalog is always up to date by incrementally scanning the data itself.

- Business Professionals: Spend less time searching and more time working. Waterline makes it easy to quickly search for and find the high-quality data you need to do your job.
- Governance Professionals: Waterline automates the application of compliance policies by integrating tagged data with your security infrastructure.
- Data Professionals: Waterline reduces the time you spend manually discovering, tagging and organizing data so you can get new data to business users more quickly.

- Discover: Automatically and incrementally fingerprints data and infers data lineage at scale by analyzing actual data values.
- Organize: Uses machine learning to automatically tag data and match data fingerprints to glossary terms.
- Curate: Human reviewers accept or reject tags; machine learning fine-tunes the tagging process and improves the matching algorithm.
- Search: Search for data through the Waterline GUI or through integration with 3rd party applications.
- Rate & Collaborate: Users collaborate to create subjective crowdsourced ratings and reviews which, combined with objective profiling metadata, give users a view into data quality and usefulness.
- Compliance: Map your compliance policies to your data assets: acceptable use, legal holds, expiry.
- Reporting: Simplify ongoing mandated reporting as the catalog uncovers dark data and catalogs it dynamically.
- Access control: Automate data access control via tag-based security.

DATASHEET SMART DATA CATALOG WATERLINE DATA 2017

Get value in days, not weeks or months

You have thousands of datasets with millions of distinct data fields across your company, and that number is growing every day. Manually documenting your catalog isn't an option! Waterline Data automatically catalogs all your data assets so you get value from your data catalog right out of the box.

Reduces manual tagging of data by over 80%

- Waterline Data Fingerprinting combines big data analysis, machine learning and human curation to automatically catalog data and data lineage at scale.
- Data stewards accept or reject automatically suggested tags, and the system learns, fine-tunes and improves the matching algorithm.
- Works natively on Hadoop and Spark to easily scale to handle all your data.
- Works seamlessly across a wide variety of data sources (relational, files, Hadoop, etc.) because you never know where the most important data is located.

Self-service accelerates time to value

You're a business professional, and when you have questions, you need reliable answers. But where is the right data, and who do you ask? Waterline consolidates your tribal data knowledge and makes it easy to share with others so you and your colleagues can quickly find the data you need.

Business users can easily find and share the right data

- Easy-to-use web search interface, with facets and filters, designed specifically for business users to search a catalog of trusted, curated data.
- Search directly from your existing data wrangling and visualization tools, integrated through our REST APIs.
- Crowdsource annotations and view the comments of other users to capture tribal knowledge and establish trusted data sources.
- Automatically propagates data tags so users can easily find similar data across all sources: Hadoop, Hive, relational, cloud, etc.

Govern your data with agility

Data governance isn't one size fits all. We provide the appropriate level of governance for whatever type of data is being managed.

- Simplifies data governance by delivering a truly scalable, automated and dynamic process for identifying sensitive data, capturing data lineage, and ensuring proper data use and access.
- User and role management ensures proper data access for sensitive data and integrates directly with Apache Ranger and Cloudera Sentry to enable tag-based access control.
- Allows data stewards to efficiently manage tagging rules, curate the data catalog, and manage proper access to data.
- Auditing provides full traceability for how all users have tagged, curated, commented and searched for data within the data catalog.
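To make the idea of tag-based access control concrete, here is a minimal sketch under stated assumptions. It does not use the Apache Ranger or Cloudera Sentry APIs, where such policies would actually be defined and enforced; the role names, tag names, and the `can_read` function are hypothetical, and the rule shown (a role may read a dataset only if every tag on it is permitted for that role) is just one possible policy.

```python
from typing import Dict, Set

# Hypothetical policy: the tags each role is allowed to read.
ROLE_ALLOWED_TAGS: Dict[str, Set[str]] = {
    "analyst": {"public", "sales"},
    "steward": {"public", "sales", "pii"},
}


def can_read(role: str, dataset_tags: Set[str],
             policy: Dict[str, Set[str]] = ROLE_ALLOWED_TAGS) -> bool:
    """Allow access only if the role is known and permits every tag on the dataset."""
    if role not in policy:
        return False  # unknown roles are denied by default
    return dataset_tags <= policy[role]


# A dataset tagged as containing personal data is hidden from analysts
# but visible to stewards.
customer_table_tags = {"sales", "pii"}
analyst_ok = can_read("analyst", customer_table_tags)
steward_ok = can_read("steward", customer_table_tags)
```

Because access is decided from tags rather than from individual tables or files, newly discovered data inherits the right restrictions as soon as it is tagged.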

Architecture

The Waterline Smart Data Catalog runs either on premises or in a variety of cloud environments. Additionally, almost every user interface that you see is available as a REST API, which makes it easy to integrate the data catalog with existing metadata sources, add new data sources, and incorporate the catalog as part of a larger data workflow process.

Waterline Smart Data Catalog can run within the following execution environments: Cloudera CDH, Hortonworks HDP, Amazon EMR, MapR, Infosys IIP, Microsoft Azure.

Waterline Smart Data Catalog can connect to the following data sources: HDFS, Hive, Teradata, Oracle, MySQL, MSSQL, Redshift, S3, MS ADLS, MS Blob. Any JDBC-connected relational data store can also be quickly added.

waterlinedata.com
Sales: sales@waterlinedata.com
Technical Support: Visit the Support Center, help@waterlinedata.com
Corporate Headquarters: 201 San Antonio Circle, Suite 260, Mountain View, CA 94040
Phone: (650) 946-2104