DATASHEET
Smart Data Catalog

Data is distributed so widely across organizations that data and business professionals don't know what data is available or valuable. When it's time to create a new report or dashboard for the CxO, respond to a request from a government regulator, or simply work out whether you can eliminate some of the excess data in your organization, the first step is usually a mad scramble to understand your existing data environment.

The Waterline Smart Data Catalog discovers and raises trusted data above the waterline so you have the data you need to run your organization effectively. We automate the discovery, matching and tagging process and keep the catalog up to date by incrementally scanning the data itself.

Business Professionals: Spend less time searching and more time working. Waterline makes it easy to quickly search for and find the high-quality data you need to do your job.
Governance Professionals: Waterline automates the application of compliance policies by integrating tagged data with your security infrastructure.
Data Professionals: Reduce the time you spend manually discovering, tagging and organizing data so you can get new data to business users more quickly.

Discover: Automatically and incrementally fingerprints data and infers data lineage at scale by analyzing actual data values.
Compliance: Map your compliance policies to your data assets: acceptable use, legal holds, expiry.
Search: Search for data through the Waterline GUI or through integration with third-party applications.
Organize: Uses machine learning to automatically tag data and match data fingerprints to glossary terms.
Curate: Human reviewers accept or reject tags; machine learning fine-tunes the tagging process and improves the matching algorithm.
Reporting: Simplify ongoing mandated reporting as the catalog uncovers dark data and catalogs it dynamically.
Access control: Automate data access control via tag-based security.
Rate & Collaborate: Users collaborate to create subjective, crowdsourced ratings and reviews which, combined with objective profiling metadata, give users a view into data quality and usefulness.

DATASHEET SMART DATA CATALOG WATERLINE DATA 2017 1
Get value in days, not weeks or months

You have thousands of datasets with millions of distinct data fields across your company, and that number is growing every day. Manually documenting your catalog isn't an option! Waterline Data automatically catalogs all your data assets so you get value from your data catalog right out of the box.

Reduces manual tagging of data by over 80%

Waterline Data Fingerprinting combines big data analysis, machine learning and human curation to automatically catalog data and data lineage at scale.
Data stewards accept or reject automatically suggested tags, and the system learns, fine-tunes and improves the matching algorithm.
Works natively on Hadoop and Spark to scale easily to all your data.
Works seamlessly across a wide variety of data sources (relational, files, Hadoop, etc.) because you never know where the most important data is located.
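The suggest-then-review loop described above can be illustrated with a minimal Python sketch. It is not Waterline's algorithm: the value-set "fingerprint", the Jaccard similarity score, the threshold, the function names and the sample glossary are all assumptions chosen only to show how value-based matching and steward feedback might interact.

```python
# Illustrative only: a toy value-based fingerprint and tag suggester.
# All names, data and the similarity measure are hypothetical.

def fingerprint(values):
    """Toy fingerprint: the set of normalized distinct values in a column."""
    return {str(v).strip().lower() for v in values}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_tags(column_values, glossary, threshold=0.5):
    """Return (term, score) pairs whose reference values resemble the column."""
    fp = fingerprint(column_values)
    scored = [
        (term, jaccard(fp, fingerprint(reference)))
        for term, reference in glossary.items()
    ]
    matches = [(term, score) for term, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


def record_review(glossary, term, column_values, accepted):
    """On acceptance, fold the column's values into the term's reference set
    so that later columns with similar values score higher."""
    if accepted:
        merged = fingerprint(glossary[term]) | fingerprint(column_values)
        glossary[term] = sorted(merged)


# Hypothetical glossary: term -> sample reference values.
SAMPLE_GLOSSARY = {
    "us_state": ["CA", "NY", "TX", "WA"],
    "currency": ["USD", "EUR", "GBP"],
}
```

In this sketch, a column containing `ca, ny, tx, or` is suggested the `us_state` tag; once a steward accepts it, the reference set grows and a later column containing `OR, WA` scores higher against the same term than it would have before.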
Self-service accelerates time to value

You're a business professional, and when you have questions you need reliable answers. But where is the right data, and who do you ask? Waterline consolidates your tribal data knowledge and makes it easy to share with others so you and your colleagues can quickly find the data you need.

Business users can easily find and share the right data

Easy-to-use web search interface, with facets and filters, designed specifically for business users to search a catalog of trusted, curated data.
Search directly from your existing data wrangling and visualization tools, integrated through our REST APIs.
Crowdsource annotations and view the comments of other users to capture tribal knowledge and establish trusted data sources.
Automatically propagates data tags so users can easily find similar data across all sources: Hadoop, Hive, relational, cloud, etc.
Govern your data with agility

Data governance isn't one size fits all. We provide the appropriate level of governance for whatever type of data is being managed.

Simplifies data governance by delivering a truly scalable, automated and dynamic process for identifying sensitive data, capturing data lineage, and ensuring proper data use and access.
User and role management ensures proper access to sensitive data and integrates directly with Apache Ranger and Cloudera Sentry to enable tag-based access control.
Allows data stewards to efficiently manage tagging rules, curate the data catalog, and manage proper access to data.
Auditing provides full traceability of how all users have tagged, curated, commented on and searched for data within the data catalog.
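The idea behind tag-based access control can be shown with a small Python sketch. It does not use the Apache Ranger or Cloudera Sentry APIs and is not Waterline's implementation; the policy structure, role names, tag names and the `can_access` function are hypothetical, and the deny-unless-permitted rule is just one plausible policy model.

```python
# Illustrative only: a toy tag-based access check.
# Policies, roles and tags below are hypothetical examples.

def can_access(user_roles, dataset_tags, tag_policies):
    """Allow access only if every policy-governed tag on the dataset
    permits at least one of the user's roles. Tags without a policy
    place no restriction on access."""
    roles = set(user_roles)
    for tag in dataset_tags:
        allowed_roles = tag_policies.get(tag)
        if allowed_roles is not None and not (roles & set(allowed_roles)):
            return False
    return True


# Hypothetical policies: tag -> roles allowed to read data carrying that tag.
SAMPLE_POLICIES = {
    "PII": {"steward", "compliance"},
    "FINANCIAL": {"finance", "compliance"},
}
```

Because the check is driven by tags rather than by individual datasets, any dataset that later receives the `PII` tag is covered by the same rule without a new per-dataset policy.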
Architecture

The Waterline Smart Data Catalog runs on premises or in a variety of cloud environments. Additionally, almost every user interface that you see is available as a REST API, which makes it easy to integrate the data catalog with existing metadata sources, add new data sources, and incorporate the catalog into a larger data workflow process.

Waterline Smart Data Catalog can run within the following execution environments: Cloudera CDH, Hortonworks HDP, Amazon EMR, MapR, Infosys IIP, Microsoft Azure.
Waterline Smart Data Catalog can connect to the following data sources: HDFS, Hive, Teradata, Oracle, MySQL, MSSQL, Redshift, S3, MS ADLS, MS Blob. Any JDBC-connected relational data store can also be quickly added.

waterlinedata.com
(650) 946-2104

Sales: sales@waterlinedata.com
Technical Support: Visit the Support Center, help@waterlinedata.com
Corporate Headquarters: 201 San Antonio Circle, Suite 260, Mountain View, CA 94040