The Root Cause of Unstructured Data Problems is Not What You Think

Bruce Thompson, CEO, Action Information Systems
www.expeditefile.com

What is this presentation all about?
- Today's storage model has no idea what it is storing.
- Expedite implements a new storage model that knows what business information is being stored.

So What? Who's Suffering?
Every aspect of unstructured data management is limited by the old storage model:
- Backup industry
- File servers
- HSM, ILM, Tier 2, NearLine, LTFS, etc.
  - Hit the wall first (>10 years ago)
The storage model is not wrong, just very limited. It's time for a new model.

Why Know What Is Stored?
- Solves many storage problems
- Enables currently impossible features
- Storage management is MUCH easier
- Opens new markets
- New revenue opportunities
- Fun

WARNING!
- Changes the way we think about storage
- Can make storage people uncomfortable
- Users may know more about this than we do
- Change can make investments less relevant
- Any use of the word "process" is Kryptonite!
- Storage people find it easy to dismiss

Ways to Dismiss the New Model
- That's impossible!
- It's not my job, strategy, scope, etc.
- We don't / can't / won't do processes
- Every process is completely unique anyway
- Requires massive OS and storage changes
- Demands massive professional services
- No one is asking for it
- Customers won't understand it

What is Unstructured Data?
- Pile-of-files? Ugly bags of mostly bits?
- Who gets to define it? The ones trying to run their companies with unstructured data.
- New community of users:
  - Information Owners: responsible for the data
  - Information Stewards: manage the data
- How do THEY define unstructured data?

When asked, what do they say?
Policies, contracts, easements, patents, professional agreements, tax records, employee agreements, incident reports, change requests, time sheets, expense reports, purchase orders, engineering reports, BoD minutes, flight records, explosive permits.

Information Asset
- What business people work with, communicate with, get evaluated against, and are paid to control
- Atomic unit of business information
- Defined as a set of files, tracking data, processes, states, rules, people, etc. that, collectively, are meaningful to their organization
- All have lifecycles
- Can include thousands of files

Information Asset Example: Contract
Possible contract components:
- The Word document
- The client name
- The type of contract
- Template that was used to create the contract
- Previous versions of the contract
- List of people who need to approve it
- The costing spreadsheet used to create sections
- List of people who have not yet approved it
- Proof that people actually approved it
- A cryptographic signature to detect tampering
- A copy to put up on the website
- Copies of important emails from the customer
- References to previous contracts
- Validation script to run against the contract
- The PDF version suitable to send to the client
- Client contact information
- The value of the contract
- The audit log of everything that happened to it
- Rejected versions
- The sales rep who created it
- Scanned signature page
- List of people to be notified when approved
- Proof that the customer received the copy
- A mirrored copy for protection
- Login of users who can access it via the website
- Any related photos, sketches, or drawings
- Customer account number
- How long people are given to approve or reject it
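To make the idea of an information asset concrete, here is a minimal Python sketch of a contract asset that bundles files, metadata, approvers, a lifecycle state, and an audit log. It is only an illustration of the concept, not how Expedite represents assets: the class name `InformationAsset`, its fields, and the state names "draft" and "approved" are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class InformationAsset:
    """Hypothetical model of one atomic unit of business information."""

    name: str
    asset_type: str                       # e.g. "contract"
    state: str = "draft"                  # assumed lifecycle state names
    files: list[Path] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)
    approvers: list[str] = field(default_factory=list)
    approved_by: list[str] = field(default_factory=list)
    audit_log: list[str] = field(default_factory=list)

    def pending_approvers(self) -> list[str]:
        """People who have not yet approved the asset."""
        return [p for p in self.approvers if p not in self.approved_by]

    def approve(self, person: str) -> None:
        """Record an approval; the asset becomes 'approved' once all sign off."""
        if person not in self.approvers:
            raise ValueError(f"{person} is not an approver of {self.name}")
        if person not in self.approved_by:
            self.approved_by.append(person)
            self.audit_log.append(f"approved by {person}")
        if not self.pending_approvers():
            self.state = "approved"


# Example data is invented for illustration.
contract = InformationAsset(
    name="Acme services contract",
    asset_type="contract",
    files=[Path("contract.docx"), Path("contract.pdf"), Path("costing.xlsx")],
    metadata={"client": "Acme", "value": "25000"},
    approvers=["legal", "finance"],
)
contract.approve("legal")
print(contract.state, contract.pending_approvers())
```

The point of the sketch is that the files are only one field among several: the state, the people, and the log travel with them as a single unit.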

How do they control them?
- When asked, users very quickly start to discuss collections of information assets, all with the same state.
- We call such a collection a Container.
- This is where we can automatically apply storage management functions!
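A rough sketch of the container idea, assuming each asset carries a state label: assets are grouped by state, and one storage policy is applied per group rather than per file. The state names and the policy descriptions in `POLICIES` are hypothetical, chosen only to show the shape of the mechanism.

```python
from collections import defaultdict
from typing import Callable

# (asset name, state) pairs stand in for full information assets.
assets = [
    ("contract-001", "approved"),
    ("contract-002", "draft"),
    ("contract-003", "approved"),
]


def group_into_containers(items: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collect assets that share a state into one container per state."""
    containers: dict[str, list[str]] = defaultdict(list)
    for name, state in items:
        containers[state].append(name)
    return dict(containers)


# Hypothetical policy table: one storage function per container state.
POLICIES: dict[str, Callable[[str], str]] = {
    "draft": lambda name: f"{name}: read/write, back up on change",
    "approved": lambda name: f"{name}: read-only, mirror, sign",
}


def apply_policies(containers: dict[str, list[str]]) -> list[str]:
    """Apply each container's policy to every asset it holds."""
    actions = []
    for state, names in containers.items():
        policy = POLICIES[state]
        actions.extend(policy(name) for name in names)
    return actions


containers = group_into_containers(assets)
actions = apply_policies(containers)
for line in actions:
    print(line)
```

Because the policy is attached to the container, an asset that changes state picks up the new storage behavior without anyone touching individual files.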

The New Business Stack
[Diagram. Labels: Information Asset Portfolio; Process; Container; Information Asset; Directories, Files, Objects; New Storage Model; Existing Storage Model; Leverage As-Is]

Requirements Difficult to Get?
- Talk to business people in THEIR terms, not storage terms.
- I haven't seen a process that couldn't be defined by an information owner in less than 60 seconds (usually much quicker).
- The challenge is to be able to do something with their requirements.

WHAT CAN STORAGE DO WITH THIS NEW MODEL?

Here is a simple quiz
Want to implement a storage function on files:
A: Requires a way to set a bit on the file.
B: Requires the bit to be stored.
C: Implement the storage function.
Q: Which one's harder?

Problem with Storage
Setting the bit is orders of magnitude harder!
- Don't know which files to set the bit on.
- Don't know when to set the bit.
- Don't know when NOT to set the bit.
This new model of unstructured data provides the missing intelligence to answer these questions.
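The contrast can be sketched in a few lines of Python. Setting the read-only bit is a single `chmod` call; the hard part is the rule that decides which files get it and when. Here that rule comes from an asset state. The set `READ_ONLY_STATES` and the state names are assumptions for illustration, not something defined in the presentation.

```python
import stat
import tempfile
from pathlib import Path

# Assumed rule: assets in these states must not be modified.
READ_ONLY_STATES = {"approved", "archived"}

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def set_read_only(path: Path, read_only: bool) -> None:
    """The easy part: flip the write permission bits on one file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    if read_only:
        path.chmod(mode & ~WRITE_BITS)
    else:
        path.chmod(mode | stat.S_IWUSR)


def apply_state(files: list[Path], state: str) -> None:
    """The missing intelligence: the asset state says when to set the bit."""
    read_only = state in READ_ONLY_STATES
    for f in files:
        set_read_only(f, read_only)


def is_writable_by_owner(path: Path) -> bool:
    return bool(path.stat().st_mode & stat.S_IWUSR)


with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "contract.docx"
    doc.write_text("terms")
    apply_state([doc], "approved")       # state change locks the file
    locked = not is_writable_by_owner(doc)
    apply_state([doc], "draft")          # back in process: writable again
    unlocked = is_writable_by_owner(doc)
    print(locked, unlocked)
```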

Users' View of Storage Policies

Example of Bit Setting Difficulty
- Ability to set a file to read-only vs. read/write: been there since the introduction of file systems.
- How are file servers ACTUALLY configured? Full control! Anyone can do anything on any file at any time, anonymously.
- Tragically Trivial To Trash!
- Upset customer

Can Backup Help?
- What can backup do when a file is deleted or corrupted? It can't do anything. It must assume the change is valid.
- It is not given enough information to know the difference between a change and a corruption.
- It is forced to leave the decision to a human, who must notice the problem and manually run a restore.
- Roll the media? Lose the only good copy?

What the Users Want
- When the asset is created or changed, protect it and back it up immediately. It is important.
- If someone deletes it by mistake or corrupts it, restore it automatically and immediately.
- But allow changes to it, done via process controls.
- What happens if you deliver this?
  - Users trust the data
  - They don't make copies
  - Most efficient dedupe!
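A minimal sketch of "protect immediately, restore automatically", assuming a mirror directory and a recorded digest per file. A plain SHA-256 digest is used here as a stand-in for a cryptographic signature, and the function names `protect` and `verify_or_restore` are invented for this example. A legitimate change made through the process would call `protect` again to record the new digest; anything else is treated as corruption.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 of the file contents (stand-in for a real signature)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def protect(path: Path, mirror_dir: Path) -> str:
    """Mirror the file right away and return the digest to record for it."""
    mirror_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(path, mirror_dir / path.name)
    return digest(path)


def verify_or_restore(path: Path, mirror_dir: Path, expected: str) -> bool:
    """Restore from the mirror if the file is missing or altered.

    Returns True when a restore was performed.
    """
    if path.exists() and digest(path) == expected:
        return False
    shutil.copy2(mirror_dir / path.name, path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    asset = root / "contract.txt"
    asset.write_text("agreed terms")
    recorded = protect(asset, root / "mirror")

    asset.write_text("garbage")              # change made outside the process
    restored = verify_or_restore(asset, root / "mirror", recorded)
    content_after = asset.read_text()

    asset.unlink()                           # accidental delete
    restored_again = verify_or_restore(asset, root / "mirror", recorded)
    exists_after = asset.exists()
    print(restored, content_after, restored_again, exists_after)
```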

New Requirements
- This user community will have a new set of requirements for the storage industry.
- White paper outlining >100 differences.
- They revolve around their definition of unstructured data.
- Our software, Expedite, provides the bridge between their requirements (market) and today's storage implementations.

Levels of Asset Control
1. Files and Directories
2. First Stage
3. Identification
4. Categorized
5. Process
6. Tailored Process
7. Custom Process
8. Purpose-built application
Increasing levels of process controls.

EXPEDITE

Why Is Expedite Different? B.A.S.I.C.
B. Business Users (Owners and Stewards).
A. Assets: the unit of information is the asset.
S. Storage: storage aware.
I. Integrated intelligence.
C. Cloud: handles all the different topologies and access methods needed for today's business.

Basic Architecture
[Diagram. Labels: Server; Audit; Watcher; File System; Rules Engine; Behavior Pool; CIFS; P2P; P2P Client; Mapped Drive Letter; Business Process Commands; XML format; Right-Click Popup Dialogs; SQL DB; Rules Config; HTTPD; Browser]

Basic Controls: First Stage Container
How to set up a First Stage container? Answer 3 questions:
- What to call it
- Where to put it in the directory structure
- Where to put it in the portfolio map
- (Optional) Who has access to it

First Stage Features
- Create: from templates
- Import: single files
- Upload: directory trees
- Metadata: extendable
- Attachments: emails, multimedia, etc.
- Permissions: protect from modification
- Mirroring: auto recovery
- Crypto-signatures: corruption detection
- Check-in/checkout: modification process
- Versioning: history
- Logging: forensics
- Search: metadata, names, content
- Archive: move to new location
- Obsolete: controlled deletion process

Set Up Enough Containers
- The result is the Information Asset Portfolio.
- It captures the relationships among the information assets.
- It is what users want to use to navigate information. They don't want enterprise search.
- Now we can set bits at a very high level and keep them set as data is created and moved.

What can you do with the map?
- Navigate: go to the data, don't search for it.
- Set up automated storage management
- Documentation (processes, owners, stewards...)
- Security
- Access methods
- Audit
- Business intelligence and analytics
- Can tell what is not important

Who else needs the portfolio?
eDiscovery, Classification, Business Process Management, Backup, Active Archive, Compliance, Data Governance, Master Data Management, Information Assurance, Virtualization, Identity Management, File Sharing, Data Loss Prevention, Cloud Storage, Big Data, Security, Disaster Recovery.

Challenge to SNIA
- Who is going to ultimately control the portfolio?
- Whoever controls the portfolio will control the industries.
- Will the storage industry step up and drive this, or will another industry take the lead?

Dealing with Dark Data
- No silver bullet
- Users want to:
  - Determine what it is and what state it is in
  - Find what is important
  - Find what is valid
- They really want these converted to information assets!
- Why is that so difficult?

Content to Context Conversion
- In the general case, it is not possible to recreate the context of a file through analysis of its content.
- There is plenty of evidence from the past several decades that this is true.
- It asks for the creation of information that is simply not stored in the computer.
- Big Data's approach of pouring all unstructured data into Hadoop may not yield useful results.

Dark Data Attack Strategy
- Create the asset portfolio.
- Have places to put the existing files.
- Have some basic process controls moving forward.
- Move existing data up the process value chain as appropriate for the type of information.
- Can require significant human interaction.
- May not be worthwhile or completely possible.
- Decide if business decisions are still pending.

ILM
- Promise of slow, huge, cheap devices?
- 60-80% of data not accessed for a year?
- Move the old data to cheap storage! Significant potential for cost savings.
- Has been attempted many times (6 for me!).
- All attempts have been file system based (like LTFS).
- All have failed. (NOT a shared service for all.)
- People have been doing this with paper forever!

ILM Challenges and Limitations
- Users and operating systems can't handle large delays.
- Something or someone must orchestrate device access to prevent massive performance problems.
- It ends up limited to a single application, sometimes a single person.
- What is needed is a common, integrated, shared service available to all.

Required for ILM Support
- Must know what data can be moved to the library and when. Can't guess.
- Can't automatically restore a file on access.
- Must provide an interface so users know beforehand that they are going to have to wait. Let users decide how to proceed.
- Users have to know beforehand that the data they retrieve is what they are looking for. This is the hardest one of all.
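These requirements can be sketched as two small decisions, assuming each asset record carries a state set by the business process and a current location. The names `AssetRecord`, `MOVABLE_STATES`, and `request_access`, along with the state and location strings, are hypothetical. The sketch moves data only when its state explicitly allows it, and never recalls data silently on access.

```python
from dataclasses import dataclass

# Assumed rule: only assets in these states may go to slow, cheap storage.
MOVABLE_STATES = {"archived", "obsolete"}


@dataclass
class AssetRecord:
    name: str
    state: str                  # set by the business process, not guessed
    location: str = "online"    # "online" or "library"


def can_move_to_library(asset: AssetRecord) -> bool:
    """Move only when the process state says so; age alone is not enough."""
    return asset.location == "online" and asset.state in MOVABLE_STATES


def request_access(asset: AssetRecord, user_accepts_wait: bool,
                   est_wait_minutes: int) -> str:
    """Tell the user about the delay first, then let them decide."""
    if asset.location == "online":
        return "open"
    if user_accepts_wait:
        return f"recall queued (about {est_wait_minutes} min)"
    return "cancelled by user"


timesheets = AssetRecord("2009 timesheets", "archived")
if can_move_to_library(timesheets):
    timesheets.location = "library"
print(request_access(timesheets, user_accepts_wait=True, est_wait_minutes=20))
```

The hardest requirement on the slide, knowing beforehand that the data is what the user is looking for, is not addressed by this sketch; it depends on the portfolio and metadata staying online while the files move.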

What If We Did ILM Correctly?
- What if we were able to utilize huge, slow, cheap storage as a general-purpose, sharable service integrated into our processes?
- Convert the piling problem to a pipeline problem.
- That is the only real winning strategy to attack the tsunami of unstructured data.
- Can tape do this?

Conclusion
- The industry is limited by the storage model
- Every aspect of storage management can be improved, expanded, and integrated
- Rare technology that can amplify the value of existing infrastructure investments
- The storage industry should control the portfolio
- A new set of users to empower
- Aligns business with IT
- Looking for progressive partners

Contact Info
Bruce Thompson, CEO
Action Information Systems, Inc.
www.expeditefile.com
See "For Storage Professionals" for details.
brucet@expeditefile.com
303-912-3172