Big Data Big Mess? Ein Versuch einer Positionierung

Similar documents
SAP IQ Software16, Edge Edition. The Affordable High Performance Analytical Database Engine

Domain Services Clusters Centralized Management & Storage for an Oracle Cluster Environment Markus Flechtner

Modern Database Concepts

Strategic Briefing Paper Big Data

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX

Hadoop An Overview. - Socrates CCDH

Module - 17 Lecture - 23 SQL and NoSQL systems. (Refer Slide Time: 00:04)

BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,

WELCOME. Unterstützung von Tuning- Maßnahmen mit Hilfe von Capacity Management. DOAG SIG Database

Stages of Data Processing

IBM Cognitive Systems Cognitive Infrastructure for the digital business transformation

CIB Session 12th NoSQL Databases Structures

Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam

At-Scale Data Centers & Demand for New Architectures

Hierarchy of knowledge BIG DATA 9/7/2017. Architecture

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

STATE OF MODERN APPLICATIONS IN THE CLOUD

NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe

Intro to Neo4j and Graph Databases

From Internet Data Centers to Data Centers in the Cloud

Oracle Database Failover Cluster with Grid Infrastructure 11g Release 2

Introduction to NoSQL by William McKnight

REALTIME WEB APPLICATIONS WITH ORACLE APEX

WHERE TO PUT DATA. or What are we going to do with all this stuff?

Chapter 3. Foundations of Business Intelligence: Databases and Information Management

TECHED USER CONFERENCE MAY 3-4, 2016

Data 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp.

Big Data The end of Data Warehousing?

I am a Data Nerd and so are YOU!

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

The Microsoft Big Data architecture approach

Acquiring Big Data to Realize Business Value

Oracle and Tangosol Acquisition Announcement

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

Scripting OBIEE Is UDML and XML all you need?

How to Use NoSQL in Enterprise Java Applications

When, Where & Why to Use NoSQL?

Chapter 6 VIDEO CASES

CERN s Business Computing

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk

Big Data - Some Words BIG DATA 8/31/2017. Introduction

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management

Integrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers

Chapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES

Next Generation DWH Modeling. An overview of DWH modeling methods

Kaladhar Voruganti Senior Technical Director NetApp, CTO Office. 2014, NetApp, All Rights Reserved

Nowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype?

IBM Storage Solutions & Software Defined Infrastructure

Polyglot Persistence in Today s Data World

Exadata Database Machine Resource Management teile und herrsche!

Embedded Technosolutions

Big Data Analytics using Apache Hadoop and Spark with Scala

Challenges for Data Driven Systems

turning data into dollars

Harnessing the Internet of Things with NoSQL

Database Sharding with Oracle RDBMS

Realtime visitor analysis with Couchbase and Elasticsearch

Safe Harbor Statement

Service discovery in Kubernetes with Fabric8

Graph Databases. Graph Databases. May 2015 Alberto Abelló & Oscar Romero

<Insert Picture Here> Introduction to Big Data Technology

Introduction to Database Services

Designing for Performance: Database Related Worst Practices ITOUG Tech Day, 11 November 2016, Milano (I) Christian Antognini

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data

745: Advanced Database Systems

Microsoft Big Data and Hadoop

Efficiency at Scale. Sanjeev Kumar Director of Engineering, Facebook

Progress DataDirect For Business Intelligence And Analytics Vendors

Building an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Spatial Analytics Built for Big Data Platforms

Object-Relational Mapping Tools let s talk to each other!

Oracle Database 11g for Data Warehousing & Big Data: Strategy, Roadmap Jean-Pierre Dijcks, Hermann Baer Oracle Redwood City, CA, USA

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 7

Introduction to Graph Databases

Empfehlungen vom BigData Admin

Renovating your storage infrastructure for Cloud era

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015

Big Data Architect.

Certified Big Data and Hadoop Course Curriculum

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure

Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition

Next-Generation Cloud Platform

DATABASE SCALE WITHOUT LIMITS ON AWS

Big Data Issues for Federal Records Managers

Backup Methods from Practice

SEARCHING BILLIONS OF PRODUCT LOGS IN REAL TIME. Ryan Tabora - Think Big Analytics NoSQL Search Roadshow - June 6, 2013

White Paper Impact of DoD Cloud Strategy and FedRAMP on CSP, Government Agencies and Integrators.

Distributed Databases: SQL vs NoSQL

Big Data and Enterprise Data, Bridging Two Worlds with Oracle Data Integration

Traditional RDBMS Wisdom is All Wrong -- In Three Acts "

NOSQL Databases and Neo4j

High-Level Data Models on RAMCloud

High-Performance Distributed DBMS for Analytics

New Approach to Unstructured Data

TALK 1: CONVINCE YOUR BOSS: CHOOSE THE "RIGHT" DATABASE. Prof. Dr. Stefan Edlich Beuth University of Technology Berlin (App.Sc.)

Mining for insight. Osma Ahvenlampi, CTO, Sulake Implementing business intelligence for Habbo

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Hello, my name is Cara Daly, I am the Product Marketing Manager for Polycom Video Content Management Solutions and today I am going to be reviewing

Transcription:

Big Data Big Mess? Ein Versuch einer Positionierung Autor: Daniel Liebhart (Peter Welkenbach) Datum: 10. Oktober 2012 Ort: DBTA Workshop on Big Data, Cloud Data Management and NoSQL BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN 1

Agenda Some Definitions Big Data Big Business? Some technical Remarks Business Cases Ready to go? 2

Definitions: Big Data Big Data: Volume of Data (Terabyte Petabyte Range) Fast Data: Near Real-Time Access All Data: Structured Semi Structured - Unstructured IDC: The bringing together of a vast amount of data from public and private sources, combined with the intuition of business and thought leaders and the speed and affordability of today's computers, is what Big Data is all about Gartner: The term "big data" puts an inordinate focus on the issue of information volume (in every aspect from storage through transform/transport to analysis). Big data is also heavily weighted toward current issues and can lead to short-sighted decisions that will hamper the enterprise's information architecture as IT leaders try to expand and change it to meet changing business needs. 3

Definitions: NoSQL - NewSQL NoSQL: Means Not only SQL Consists of different Non-Relational Storage & Data Access Technologies like GraphDB, Big Tables, Key Value, Document, Grid and Distributed Cache Examples: Neo4J (GraphDB), HBase (Big Tables), Oracle NoSQL (Key Value), MongoDB (Document), Oracle Coherence (Grid), memcached (Distributed Cache) NewSQL: New Breed of RDBMS Different that existing RDDMS Technolgies, optimized for fast & distributed access Scale Out Architectures usually In-Memory Technologies Examples: Oracle TimesTen, VoltDB 4

Big Data Big Business - Drivers? Future Issues drives Big Data: Private / Business: No Separation any more (60% employees wishes to work at home) Tablets / Mobile: More and More (5 billion mobile phones today) Social Networks: Enterprise Agents Social Networks: Geo-Information Analysts Expectations for 2015: 25% more productivity 60% more ITspending / employee Data Storage Issue: Data Analysis on the base of replication (DWH) is not anymore possible (the amount of data is too big) 5

Big Data Big Business? Forbes: Big Data Is Big Market & Big Business - $50 Billion Market by 2017 ($5 Billion 2012 ;-)) IDC: Big Data as a Part of Future Third Platform : 1. Internal IT-Services 2. External SaaS Services (Cloud Mobile - Internet of Things ) 3. Big Data Processing But: Is Big Data like Cloud Computing? Big Hype No Business? Or: Market driven by Business Needs or Drives the Market the Business? Last Question: What exactly is big? 6

Technical Backgroud: Storage

Technical Backgroud: NoSQL Data Modeling NoSQL data modeling often starts from the application-specific queries as opposed to relational modeling: Relational modeling is typically - driven by structure of available data, - the main design theme is What answers do I have? - Answere oriented - More generic NoSQL data modeling is typically - driven by application-specific access patterns, i.e. types of queries to be supported. - The main design theme is What questions do I have? - Question oriented - More application specific 8

Technical Backgroud: Big Data Processing Example Oracle RAC 9

Technical Backgroud: Big Data Processing Example Alternative 1: MySQL Cluster 10

Technical Backgroud: Big Data Processing Example Alternative 2: NoSQL / NewSQL 11

Technical Backgroud: Storage Layers 12

Technical Backgroud: Products Today 13

Business Cases: Classics 1 Scientific Data Processing 14

Business Cases: Classics 1 Scientific Data Processing Test Usage Data Rate Data Volume ATLAS Higgs-Bosons 100 Mb / sec 1 Petabyte / year CMS Higgs-Bosons 100 Mb / sec 1 Petabyte / year Alice LHCb Big-Bang Simulation Will be explained later on 1.5 Gb / sec 1 Petabyte / month 400 TB / year 15

Business Cases: Social Media Facebook: Facebook stores much of the data on its massive Hadoop cluster, which has grown exponentially in recent years. Today the cluster holds 30 petabytes of data or, as Facebook puts it, about 3,000 times more information than is stored by the Library of Congress. Key Issue for Companies: Integration of Social Media for Market Research, Product Placement, Recommendations 16

Business Cases: Advanced Analytics Walmart: Largest Commercial DWH: ca. 500 TB Key Issue: Retail Analytics Combines Social Media Data, Geoinformations and Demographic Information for Shop Plannings Log Analysis Fraud Detection Call Center Call Storage Big Data Analytics: As a market 17

Business Cases: War Rooms / Telco / Traffic Analysis War Rooms: Finding Natural Ressources without digging in the deep Telco / Energy: Network Optimization Public & Private Transport & Logistics: Traffic Analysis & Routing Optimization 18

Business Cases - Future Trend: Structured & Non Structured Data Application Lifecycle: 10 15 Years New Application Lifecycle: 10 15 Years New Application Data Lifecycle: 20-30 Years Restructured Data Applications: Lifecycle (Build Run Dismiss): 10-15 years Data: Lifecycle (Build Run Dismiss): 20-30 years Data ist structured (ca. 20%) or non structured (ca. 80 %) Consequence: Separation of systems & management prinicples 19

Business Cases - Future Trend: Structured & Non Structured Data Consequence: Big Data as a solution to an open problem: How to access Non Structured Data? 20

Business Cases - Future Trend: Third Plattform 21

Ready to go?: Considerations for Enterprises What to watch: How Big ist my Big Data? (Investment, Standards) What to watch: Maturity of products / providers (ask you consultant) How to watch: Focus on Usage Pattern How to watch: Check the important products When to try: Check your Business Case Consumer Business? / Geoinformation / Chemical Processchains (Graph Analytics)? / RFID / NFC? When to try: Is Scaling Out your problem? 22

Ready to go?: Considerations for IT Technical Use Case: Universal Access (RDBMS) versus Predefined Access (NoSQL) Technical Use Case: Near Real-Time really needed? Technical Use Case: Combine Analytics of Structured (Yeah!) & Unstructured Data (Text, Video, Picture) Watch out for upcoming standards! Combine Solutions! 23

Thanks for your attention 24

Additional Slides Look Ma No Tables! 25

Technical Backgroud: Look Ma No Tables! 26

Technical Backgroud: Oh But here they are Composite key is a very generic technique, but it is extremely beneficial when a store with ordered keys is used. Composite keys in conjunction with secondary sorting allows to build kind of multidimensional index which is fundamentally similar to the Dimensionality Reduction technique 27

Ready to go?: Some Considerations There is a dark side to most of the current NoSQL databases. - They talk about performance, about how easy schemaless databases are to use. - About nice APIs. - They are mostly developers and - not operation and system administrators. - No-one asks those. But it s there where rubber hits the road. The three problems no-one talks about almost none, are: - ad hoc data fixing either no query language available or no skills - ad hoc reporting either no query language available or no in-house skills - data export sometimes no API way to access all data Blog answeres: a NoSQL store doesn t imply no RDBMS at all. Good discussion of running both types of DBs in one system: http://johnpwood.net/2009/09/29/using-multiple-database-models-in-a-single-application/ 28