Indiana Oracle Users Group January meeting January 27 th, Big Data
|
|
- Roberta Boone
- 5 years ago
- Views:
Transcription
1 Indiana Oracle Users Group January meeting January 27 th, 2017 Big Data
2 Agenda Big Data Hype RDBMS Vs. Big Data Data explosion Why do we need big data? How to approach? Big data process cycle Machine Learning Oracle Big Data Solutions Big Data and Cloud Use cases Expanding DBA/Sys Admin/Developer role into Big Data space 2
3 About me Ashokkumar Sivasankaran (Ashok) Database Architect, Ensono Twenty eight years in IT services OCE RAC Expert & OCP Database Administrator 7.3 to 11g AWS Solution Architect - Associate ITIL V3 Foundation Certified Member IOUG, COUG and INOUG 3
4 Snapshot of Ensono Client-first philosophy Hybrid IT services mainframe to cloud and everything in between Expertise managing missioncritical infrastructure Agile development to drive innovation Global capabilities Revenue of $265 million 4
5 Ensono s Presence Around the World Headquarters Chicago, IL Data Centers Downers Grove, IL Elk Grove Village, IL Conway, AR Little Rock, AR London, UK Slough, UK Delivery Centers Onshore IL and AR Offshore UK, Poland, Germany, India 5
6 The Power in Numbers NEARLY 20,000 SERVERS UNDER MANAGEMENT 17,700 DATABASES MANAGED 80,000 MAINFRAME MIPS UNDER MANAGEMENT 10 OVER 1.4 MILLION PETABYTES OF STORAGE MANAGED TECHNOLOGY COMPONENTS MONITORED DATA CENTER AVAILABILITY 100% ACHIEVED OVER LAST 5 YEARS 100% SECURITY AUDITS PASSED BY CLIENTS 30 GIGABITS PER SECOND OF INTERNET BANDWIDTH CAPACITY 6
7 Exceeding Expectations for Over 47 Years Ranked #1 in Customer Satisfaction by The Black Book of Outsourcing (2010) Received Uptime Institute Award for 100% Data Center availability ( ) Recognized by Sunday Times as Top 100 Companies to Work For ( ) Launched cloud platform(2007) Ranked in Top 3 for Technology Innovation by InformationWeek (2011) Ranked in Gartner Magic Quadrant, Data Center Outsourcing, North America ( ) Ranked in Gartner Magic Quadrant, Cloud Managed Hosting, Europe ( ) 7
8 Big Data Hype Source: Gartner, 2013 Hype Cycle for Emerging Technologies 8
9 Big Data Hype 9
10 Big Data Hype Last 15 years we heard about Big data, Hadoop, Data analytics, Data Visualization, Data Science Is it still in hype or matured? Answer is Yes and No Many companies are using Big data extensively and growing their business Social media, Non-profit organizations, Sports, Research, Medical, Space, Political campaigns Some are in planning, evaluation stages Many are still new to Big Data So, potential growth opportunities 10
11 Timeline 2003 Google File system paper released ( Hadoop is born 2006 Yahoo deploys 300 nodes Hadoop cluster 2007 Yahoo runs two clusters of 1000 nodes 2008 Hadoop wins Terabyte sort benchmark 2009 Hadoop sorts a petabyte 2010 Yahoo 4,000 nodes handles 70 petabytes 11
12 RDBMS Vs Big Data RDBMS Relational Database Management System(RDBMS), invented by E. F. Codd at IBM in Structured data stored in rows and columns in normalized form Improves data access and reduces data duplication. i.e., employee master table and monthly salary table Major RDBMS are: Oracle, Sqlserver, MySql, DB2 and Sybase It will continue to manage OLTP needs Big data Superset of all types of data including data managed in RDBMS All kinds of digital data (rdbms data, text, video, images, sensor data) Mainly unstructured Very large volume Big data processing is an alternate to expensive supercomputing model to some extent 12
13 Big Data 13
14 Data Explosion 1 Bit = Binary Digit 8 Bits = 1 Byte 1024 Bytes = 1 Kilobyte 1024 Kilobytes = 1 Megabyte 1024 Megabytes = 1 Gigabyte 1024 Gigabytes = 1 Terabyte 1024 Terabytes = 1 Petabyte 1024 Petabytes = 1 Exabyte 1024 Exabytes = 1 Zettabyte 1024 Zettabytes = 1 Yottabyte 1024 Yottabytes = 1 Brontobyte 1024 Brontobytes = 1 Geopbyte Source: Infragistics 14
15 Data explosion reasons Social Networks Mobile IOT devices Sensors Clickstream Research Existing data growth, capturing detailed attributes 15
16 Why do we need Big Data? Better understanding about customer preference, market trends Predict, plan and take proactive actions Converting challenges and changes to growth opportunities Thinking ahead of the curve, product issue postings at social network(i.e., about a new cell phone) Out of box thinking from pattern, correlation, outliers Nowcasting vs Forecasting Nowcasting methods based on social media content (Google, Twitter) have been developed to estimate hidden quantities such as the 'mood' of a population or the presence of a flu epidemic. 16
17 Big Data Process Cycle 17
18 Big Data Landscape Source: matturck.com 18
19 How to approach Big Data Project? Start with questions What we want to know? How do we want it? What will be the benefit? Develop Big Data strategy Connect Stakeholders Identify data, data source, extracting, storing, analyzing and visualizing processes Identify right tools Architect a pilot project to meet your requirement Don t allow scope creep Validate results at each stage 19
20 Hadoop Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures (of individual machines, or racks of machines) are commonplace and thus should be automatically handled in software by the framework. Source: Wikipedia Hadoop is for offline batch processing and not for online and transaction processing(oltp) Hadoop consists of two primary components: HDFS Distributed file system for storage MapReduce Parallel processing framework 20
21 Hadoop Architecture 21
22 NoSql Databases NoSql Non Sql, Non Relational, Not only Sql databases Used to store Key-value type data, documents, graph, wide column data MongoDb (document, Native JSON support,..) Oracle NoSql AWS DnyamoDB Redis (Stores all data in memory, good for real-time analytics) Cassendra (developed at Facebook, key-value store column based) CouchDB ( JSON access over HTTP, best for web access) Hbase (integrated with Hadoop architecture, Tabular/columnar storage, non-relational, distributed database, best for complex computing jobs) 22
23 Hadoop Distributions, other tools and Software Hadoop Distributions o Amazon Web Services(EMR Elastic Map Reduce) o Cloudera o Hortonworks o MapR Other tools ZooKeeper Coordination framework Hive SQL like interface for querying Pig High level scripting language for batch processing and ETL 23
24 Hadoop Distributions, other tools and Software Flume Distributed service for collecting, aggregating, and moving large data Sqoop Bulk data transfer Mahout Data Mining utility using machine learning Hue Web console for Hadoop Oozie Management workflow and job scheduling Storm Distributed computation framework Kaffka open source message broker Impala Cloudera Impala is massively parallel processing SQL query engine for data stored in Hadoop cluster YARN Resource management system 24
25 Implementation Select software according to data type and processing requirement Use commodity low cost servers or virtual servers or cloud infrastructure Dynamic distribution and parallelism Efficient Map-Reduce, aggregation Parallel, in-memory, minimize network communication Fault tolerance and automatic recovery Moving computation to the data(moving data is expensive) Data processed in chunks in multiple nodes(distributed computing) High availability and scalability 25
26 BIG data workflow Support Area Resource Skills required Developing and Architecting Solution Big Data Architect Understanding Business need for Big data solution, data source, moving data, processing methods, infrastructure need and developing cost effective flexible solution Hardware, Storage and OS support Hardware, Storage, OS and VM Admin Linux admin experience to build physical or virtual linux servers to create Big data cluster Network and Security Network and Security Admin Providing secured network for the Linux servers Resource and Availability Monitoring Monitoring Admin Monitor servers and key processes Backup and restore Backup Admin Backup and restore identified file systems Big data admin(hadoop or other software) Scheduling Data collection, processing and sending data jobs Hadoop/No Sql(MongoDB/CouchDB/..) Admin Job scheduler/middleware Admin Administering Big data cluster Operation, Development and Middleware support Data collection and processing Developer Strong Java /Python programming and Shell scripting experience Analytics and Data visualization Data Analyst/ Data scientist Analyzing, pattern discovery and predicting 26
27 Data Science 27
28 Machine Learning and Deep Learning Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data. Source: Techtarget Supervised learning (classification/labeling i.e., spam filter) Unsupervised learning (clustering/association i.e., pattern identification) Deep Learning is a branch of machine learning. Many Deep Leaning algorithms are applied to unsupervised learning Neural networks, Tensorflow, MXNet 28
29 Machine Learning Source: Gartner, 2016 Hype Cycle for Emerging Technologies Note: Machine Learning is at the peak, IOT, Big Data disappeared 29
30 Data Analytics and Visualization Data analytics is the science of examining data using quantitative and qualitative techniques, finding relationship, pattern, outliers and predict. Data visualization is the presentation of data in pictorial format. It helps to understand the data, pattern, impacts and make decision. 30
31 R programming R is a programming language and software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. > table(awr_data$wait_class) Commit System I/O User I/O > pie(table(awr_data$wait_class)) 31
32 Oracle Big Data Solutions Oracle Big Data Cloud Service ( Oracle Big Data Appliance ( Oracle NoSql Database ( Oracle Big Data Sql ( Oracle Advanced Analytics ( Oracle Data integrator for Big data ( Oracle Data Mining ( Oracle provides Oracle Big Data Lite virtual machine for evaluation and educational purposes ( 32
33 Big Data and Cloud Big data processing handles large volume and complex data Data flow and processing time may change rapidly We want to spend less on IT projects Easy to add or remove resources in cloud on demand Easy to scale up or down resources in cloud according to data flow Cloud offers better cost savings solutions We can get comprehensive big data processing clusters and visualization services in cloud 33
34 Big Data Use Cases Sentiment Analysis 360 degree of customer view Internet of Things (IoT) Real Time Analytics Fraud deduction Clickstream Analysis Data warehouse optimization Case studies: 34
35 Expanding DBA/Sys Admin/Developer role into Big Data space You know about data, data access, data move, security, application processing, constraints You know about compute resources - cpu, memory, i/o, storage, network Now expand your role by Learning Hadoop Architecture Learning Python or Java or R programming Understand Cloud Services (ideal infrastructure for Big data processing) Understand Big data process cycle Identify preferred area based on your interest and strength Online material, Youtube, MOOC courses(edx, Coursera), Vendors resources are available to learn Practice using Cloud resources, Virtual Boxes, Online lab 35
36 Questions? 36
37 Thank you! 37
Hadoop An Overview. - Socrates CCDH
Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected
More informationBig Data with Hadoop Ecosystem
Diógenes Pires Big Data with Hadoop Ecosystem Hands-on (HBase, MySql and Hive + Power BI) Internet Live http://www.internetlivestats.com/ Introduction Business Intelligence Business Intelligence Process
More informationHadoop. Introduction / Overview
Hadoop Introduction / Overview Preface We will use these PowerPoint slides to guide us through our topic. Expect 15 minute segments of lecture Expect 1-4 hour lab segments Expect minimal pretty pictures
More informationBig Data Hadoop Stack
Big Data Hadoop Stack Lecture #1 Hadoop Beginnings What is Hadoop? Apache Hadoop is an open source software framework for storage and large scale processing of data-sets on clusters of commodity hardware
More informationThings Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem Zohar Elkayam www.realdbamagic.com Twitter: @realmgic Who am I? Zohar Elkayam, CTO at Brillix Programmer, DBA, team leader, database trainer,
More informationBig Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours
Big Data Hadoop Developer Course Content Who is the target audience? Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Complete beginners who want to learn Big Data Hadoop Professionals
More informationThe Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou
The Hadoop Ecosystem EECS 4415 Big Data Systems Tilemachos Pechlivanoglou tipech@eecs.yorku.ca A lot of tools designed to work with Hadoop 2 HDFS, MapReduce Hadoop Distributed File System Core Hadoop component
More informationConfiguring and Deploying Hadoop Cluster Deployment Templates
Configuring and Deploying Hadoop Cluster Deployment Templates This chapter contains the following sections: Hadoop Cluster Profile Templates, on page 1 Creating a Hadoop Cluster Profile Template, on page
More informationHadoop, Yarn and Beyond
Hadoop, Yarn and Beyond 1 B. R A M A M U R T H Y Overview We learned about Hadoop1.x or the core. Just like Java evolved, Java core, Java 1.X, Java 2.. So on, software and systems evolve, naturally.. Lets
More informationProcessing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.
Processing Unstructured Data Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd. http://dinesql.com / Dinesh Priyankara @dinesh_priya Founder/Principal Architect dinesql Pvt Ltd. Microsoft Most
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationData Science and Open Source Software. Iraklis Varlamis Assistant Professor Harokopio University of Athens
Data Science and Open Source Software Iraklis Varlamis Assistant Professor Harokopio University of Athens varlamis@hua.gr What is data science? 2 Why data science is important? More data (volume, variety,...)
More informationEmbedded Technosolutions
Hadoop Big Data An Important technology in IT Sector Hadoop - Big Data Oerie 90% of the worlds data was generated in the last few years. Due to the advent of new technologies, devices, and communication
More informationSpagoBI and Talend jointly support Big Data scenarios
SpagoBI and Talend jointly support Big Data scenarios Monica Franceschini - SpagoBI Architect SpagoBI Competency Center - Engineering Group Big-data Agenda Intro & definitions Layers Talend & SpagoBI SpagoBI
More informationNew Approaches to Big Data Processing and Analytics
New Approaches to Big Data Processing and Analytics Contributing authors: David Floyer, David Vellante Original publication date: February 12, 2013 There are number of approaches to processing and analyzing
More informationBig Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018
Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/
More informationWhat is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?
Simple to start What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? What is the maximum download speed you get? Simple computation
More informationInnovatus Technologies
HADOOP 2.X BIGDATA ANALYTICS 1. Java Overview of Java Classes and Objects Garbage Collection and Modifiers Inheritance, Aggregation, Polymorphism Command line argument Abstract class and Interfaces String
More informationSpatial Analytics Built for Big Data Platforms
Spatial Analytics Built for Big Platforms Roberto Infante Software Development Manager, Spatial and Graph 1 Copyright 2011, Oracle and/or its affiliates. All rights Global Digital Growth The Internet of
More informationBig Data Hadoop Course Content
Big Data Hadoop Course Content Topics covered in the training Introduction to Linux and Big Data Virtual Machine ( VM) Introduction/ Installation of VirtualBox and the Big Data VM Introduction to Linux
More informationBlended Learning Outline: Cloudera Data Analyst Training (171219a)
Blended Learning Outline: Cloudera Data Analyst Training (171219a) Cloudera Univeristy s data analyst training course will teach you to apply traditional data analytics and business intelligence skills
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationHow Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,
How Apache Hadoop Complements Existing BI Systems Dr. Amr Awadallah Founder, CTO Cloudera, Inc. Twitter: @awadallah, @cloudera 2 The Problems with Current Data Systems BI Reports + Interactive Apps RDBMS
More informationBig Data Architect.
Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional
More informationOverview. : Cloudera Data Analyst Training. Course Outline :: Cloudera Data Analyst Training::
Module Title Duration : Cloudera Data Analyst Training : 4 days Overview Take your knowledge to the next level Cloudera University s four-day data analyst training course will teach you to apply traditional
More information<Insert Picture Here> Introduction to Big Data Technology
Introduction to Big Data Technology The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationOracle Big Data Fundamentals Ed 2
Oracle University Contact Us: 1.800.529.0165 Oracle Big Data Fundamentals Ed 2 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, you learn about big data, the technologies
More informationIntroduction to BigData, Hadoop:-
Introduction to BigData, Hadoop:- Big Data Introduction: Hadoop Introduction What is Hadoop? Why Hadoop? Hadoop History. Different types of Components in Hadoop? HDFS, MapReduce, PIG, Hive, SQOOP, HBASE,
More informationSQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism
Big Data and Hadoop with Azure HDInsight Andrew Brust Senior Director, Technical Product Marketing and Evangelism Datameer Level: Intermediate Meet Andrew Senior Director, Technical Product Marketing and
More informationCloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018
Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning
More informationMicrosoft Big Data and Hadoop
Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common
More informationGain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved.
Gain Insights From Unstructured Data Using Pivotal HD 1 Traditional Enterprise Analytics Process 2 The Fundamental Paradigm Shift Internet age and exploding data growth Enterprises leverage new data sources
More informationDepartment of Information Technology, St. Joseph s College (Autonomous), Trichy, TamilNadu, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 5 ISSN : 2456-3307 A Survey on Big Data and Hadoop Ecosystem Components
More informationOracle Big Data Fundamentals Ed 1
Oracle University Contact Us: +0097143909050 Oracle Big Data Fundamentals Ed 1 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data
More informationBig Data Analytics using Apache Hadoop and Spark with Scala
Big Data Analytics using Apache Hadoop and Spark with Scala Training Highlights : 80% of the training is with Practical Demo (On Custom Cloudera and Ubuntu Machines) 20% Theory Portion will be important
More informationMapR Enterprise Hadoop
2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS
More informationWe are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info
We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423
More informationHadoop & Big Data Analytics Complete Practical & Real-time Training
An ISO Certified Training Institute A Unit of Sequelgate Innovative Technologies Pvt. Ltd. www.sqlschool.com Hadoop & Big Data Analytics Complete Practical & Real-time Training Mode : Instructor Led LIVE
More informationBig Data Programming: an Introduction. Spring 2015, X. Zhang Fordham Univ.
Big Data Programming: an Introduction Spring 2015, X. Zhang Fordham Univ. Outline What the course is about? scope Introduction to big data programming Opportunity and challenge of big data Origin of Hadoop
More informationTop 25 Big Data Interview Questions And Answers
Top 25 Big Data Interview Questions And Answers By: Neeru Jain - Big Data The era of big data has just begun. With more companies inclined towards big data to run their operations, the demand for talent
More informationBig Data on AWS. Peter-Mark Verwoerd Solutions Architect
Big Data on AWS Peter-Mark Verwoerd Solutions Architect What to get out of this talk Non-technical: Big Data processing stages: ingest, store, process, visualize Hot vs. Cold data Low latency processing
More informationThe age of Big Data Big Data for Oracle Database Professionals
The age of Big Data Big Data for Oracle Database Professionals Oracle OpenWorld 2017 #OOW17 SessionID: SUN5698 Tom S. Reddy tom.reddy@datareddy.com About the Speaker COLLABORATE & OpenWorld Speaker IOUG
More informationSpotfire Data Science with Hadoop Using Spotfire Data Science to Operationalize Data Science in the Age of Big Data
Spotfire Data Science with Hadoop Using Spotfire Data Science to Operationalize Data Science in the Age of Big Data THE RISE OF BIG DATA BIG DATA: A REVOLUTION IN ACCESS Large-scale data sets are nothing
More informationSTATE OF MODERN APPLICATIONS IN THE CLOUD
STATE OF MODERN APPLICATIONS IN THE CLOUD 2017 Introduction The Rise of Modern Applications What is the Modern Application? Today s leading enterprises are striving to deliver high performance, highly
More informationHadoop Online Training
Hadoop Online Training IQ training facility offers Hadoop Online Training. Our Hadoop trainers come with vast work experience and teaching skills. Our Hadoop training online is regarded as the one of the
More informationScalable Web Programming. CS193S - Jan Jannink - 2/25/10
Scalable Web Programming CS193S - Jan Jannink - 2/25/10 Weekly Syllabus 1.Scalability: (Jan.) 2.Agile Practices 3.Ecology/Mashups 4.Browser/Client 7.Analytics 8.Cloud/Map-Reduce 9.Published APIs: (Mar.)*
More informationCERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)
CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationHadoop Development Introduction
Hadoop Development Introduction What is Bigdata? Evolution of Bigdata Types of Data and their Significance Need for Bigdata Analytics Why Bigdata with Hadoop? History of Hadoop Why Hadoop is in demand
More informationBig Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Big Data on AWS Big Data Agility and Performance Delivered in the Cloud 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data Technologies and techniques for working productively
More informationOracle Database High Availability and Performance Scalability
Indiana Oracle Users Group Technology Day May 4 th, 2016 Oracle Database High Availability and Performance Scalability Ashokkumar Sivasankaran About me Ashokkumar Sivasankaran (Ashok) Client Technical
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationBig Data and Cloud Computing
Big Data and Cloud Computing Presented at Faculty of Computer Science University of Murcia Presenter: Muhammad Fahim, PhD Department of Computer Eng. Istanbul S. Zaim University, Istanbul, Turkey About
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationDATA SCIENCE USING SPARK: AN INTRODUCTION
DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data
More informationPercona Live September 21-23, 2015 Mövenpick Hotel Amsterdam
Percona Live 2015 September 21-23, 2015 Mövenpick Hotel Amsterdam MongoDB, Elastic, and Hadoop: The What, When, and How Kimberly Wilkins Principal Engineer/Database Denizen ObjectRocket/Rackspace kimberly@objectrocket.com
More informationIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce Antonino Virgillito THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Large-scale Computation Traditional solutions for computing large
More informationBIG DATA ANALYTICS USING HADOOP TOOLS APACHE HIVE VS APACHE PIG
BIG DATA ANALYTICS USING HADOOP TOOLS APACHE HIVE VS APACHE PIG Prof R.Angelin Preethi #1 and Prof J.Elavarasi *2 # Department of Computer Science, Kamban College of Arts and Science for Women, TamilNadu,
More informationAWS Serverless Architecture Think Big
MAKING BIG DATA COME ALIVE AWS Serverless Architecture Think Big Garrett Holbrook, Data Engineer Feb 1 st, 2017 Agenda What is Think Big? Example Project Walkthrough AWS Serverless 2 Think Big, a Teradata
More informationIan Choy. Technology Solutions Professional
Ian Choy Technology Solutions Professional XML KPIs SQL Server 2000 Management Studio Mirroring SQL Server 2005 Compression Policy-Based Mgmt Programmability SQL Server 2008 PowerPivot SharePoint Integration
More informationAcquiring Big Data to Realize Business Value
Acquiring Big Data to Realize Business Value Agenda What is Big Data? Common Big Data technologies Use Case Examples Oracle Products in the Big Data space In Summary: Big Data Takeaways
More informationAn Introduction to Big Data Formats
Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION
More informationHadoop. Course Duration: 25 days (60 hours duration). Bigdata Fundamentals. Day1: (2hours)
Bigdata Fundamentals Day1: (2hours) 1. Understanding BigData. a. What is Big Data? b. Big-Data characteristics. c. Challenges with the traditional Data Base Systems and Distributed Systems. 2. Distributions:
More informationBIG DATA COURSE CONTENT
BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data
More informationCertified Big Data Hadoop and Spark Scala Course Curriculum
Certified Big Data Hadoop and Spark Scala Course Curriculum The Certified Big Data Hadoop and Spark Scala course by DataFlair is a perfect blend of indepth theoretical knowledge and strong practical skills
More informationOracle Big Data. A NA LYT ICS A ND MA NAG E MENT.
Oracle Big Data. A NALYTICS A ND MANAG E MENT. Oracle Big Data: Redundância. Compatível com ecossistema Hadoop, HIVE, HBASE, SPARK. Integração com Cloudera Manager. Possibilidade de Utilização da Linguagem
More informationHortonworks and The Internet of Things
Hortonworks and The Internet of Things Dr. Bernhard Walter Solutions Engineer About Hortonworks Customer Momentum ~700 customers (as of November 4, 2015) 152 customers added in Q3 2015 Publicly traded
More informationIntroduction to Data Mining and Data Analytics
1/28/2016 MIST.7060 Data Analytics 1 Introduction to Data Mining and Data Analytics What Are Data Mining and Data Analytics? Data mining is the process of discovering hidden patterns in data, where Patterns
More informationCertified Big Data and Hadoop Course Curriculum
Certified Big Data and Hadoop Course Curriculum The Certified Big Data and Hadoop course by DataFlair is a perfect blend of in-depth theoretical knowledge and strong practical skills via implementation
More informationReport on The Infrastructure for Implementing the Mobile Technologies for Data Collection in Egypt
Report on The Infrastructure for Implementing the Mobile Technologies for Data Collection in Egypt Date: 10 Sep, 2017 Draft v 4.0 Table of Contents 1. Introduction... 3 2. Infrastructure Reference Architecture...
More informationDelving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture
Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture Hadoop 1.0 Architecture Introduction to Hadoop & Big Data Hadoop Evolution Hadoop Architecture Networking Concepts Use cases
More informationInternational Journal of Advance Engineering and Research Development. A Study: Hadoop Framework
Scientific Journal of Impact Factor (SJIF): e-issn (O): 2348- International Journal of Advance Engineering and Research Development Volume 3, Issue 2, February -2016 A Study: Hadoop Framework Devateja
More informationHadoop: Components and Working
Volume 6, No. 7, September-October 2015 International Journal of Advanced Research in Computer Science RESEARCH PAPER Available Online at www.ijarcs.info Hadoop: Components and Working Nitika Arora Govt.College
More informationBig Data Analytics. Description:
Big Data Analytics Description: With the advance of IT storage, pcoressing, computation, and sensing technologies, Big Data has become a novel norm of life. Only until recently, computers are able to capture
More informationInternational Journal of Advance Engineering and Research Development. A study based on Cloudera's distribution of Hadoop technologies for big data"
Scientific Journal of Impact Factor (SJIF): 4.72 International Journal of Advance Engineering and Research Development Volume 4, Issue 8, August -2017 e-issn (O): 2348-4470 p-issn (P): 2348-6406 A study
More informationJuxtaposition of Apache Tez and Hadoop MapReduce on Hadoop Cluster - Applying Compression Algorithms
, pp.289-295 http://dx.doi.org/10.14257/astl.2017.147.40 Juxtaposition of Apache Tez and Hadoop MapReduce on Hadoop Cluster - Applying Compression Algorithms Dr. E. Laxmi Lydia 1 Associate Professor, Department
More informationInformatica Enterprise Information Catalog
Data Sheet Informatica Enterprise Information Catalog Benefits Automatically catalog and classify all types of data across the enterprise using an AI-powered catalog Identify domains and entities with
More informationOracle GoldenGate for Big Data
Oracle GoldenGate for Big Data The Oracle GoldenGate for Big Data 12c product streams transactional data into big data systems in real time, without impacting the performance of source systems. It streamlines
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationHDInsight > Hadoop. October 12, 2017
HDInsight > Hadoop October 12, 2017 2 Introduction Mark Hudson >20 years mixing technology with data >10 years with CapTech Microsoft Certified IT Professional Business Intelligence Member of the Richmond
More informationBuilding an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle
Building an Integrated Big Data & Analytics Infrastructure September 25, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to
More informationBig Data. Big Data Analyst. Big Data Engineer. Big Data Architect
Big Data Big Data Analyst INTRODUCTION TO BIG DATA ANALYTICS ANALYTICS PROCESSING TECHNIQUES DATA TRANSFORMATION & BATCH PROCESSING REAL TIME (STREAM) DATA PROCESSING Big Data Engineer BIG DATA FOUNDATION
More informationA Review Paper on Big data & Hadoop
A Review Paper on Big data & Hadoop Rupali Jagadale MCA Department, Modern College of Engg. Modern College of Engginering Pune,India rupalijagadale02@gmail.com Pratibha Adkar MCA Department, Modern College
More informationWhen, Where & Why to Use NoSQL?
When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),
More information2013 AWS Worldwide Public Sector Summit Washington, D.C.
2013 AWS Worldwide Public Sector Summit Washington, D.C. EMR for Fun and for Profit Ben Butler Sr. Manager, Big Data butlerb@amazon.com @bensbutler Overview 1. What is big data? 2. What is AWS Elastic
More informationIntegrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers
Oracle zsig Conference IBM LinuxONE and z System Servers Integrating Oracle Databases with NoSQL Databases for Linux on IBM LinuxONE and z System Servers Sam Amsavelu Oracle on z Architect IBM Washington
More information1Z Oracle Big Data 2017 Implementation Essentials Exam Summary Syllabus Questions
1Z0-449 Oracle Big Data 2017 Implementation Essentials Exam Summary Syllabus Questions Table of Contents Introduction to 1Z0-449 Exam on Oracle Big Data 2017 Implementation Essentials... 2 Oracle 1Z0-449
More informationOracle Data Integrator 12c: Integration and Administration
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 67863102 Oracle Data Integrator 12c: Integration and Administration Duration: 5 Days What you will learn Oracle Data Integrator is a comprehensive
More informationA Survey on Big Data
A Survey on Big Data D.Prudhvi 1, D.Jaswitha 2, B. Mounika 3, Monika Bagal 4 1 2 3 4 B.Tech Final Year, CSE, Dadi Institute of Engineering & Technology,Andhra Pradesh,INDIA ---------------------------------------------------------------------***---------------------------------------------------------------------
More informationIncrease Value from Big Data with Real-Time Data Integration and Streaming Analytics
Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time
More informationdocs.hortonworks.com
docs.hortonworks.com : Getting Started Guide Copyright 2012, 2014 Hortonworks, Inc. Some rights reserved. The, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing,
More informationIntroduction to Big Data. Hadoop. Instituto Politécnico de Tomar. Ricardo Campos
Instituto Politécnico de Tomar Introduction to Big Data Hadoop Ricardo Campos Mestrado EI-IC Análise e Processamento de Grandes Volumes de Dados Tomar, Portugal, 2016 Part of the slides used in this presentation
More informationBIG DATA ANALYTICS A PRACTICAL GUIDE
BIG DATA ANALYTICS A PRACTICAL GUIDE STEP 1: GETTING YOUR DATA PLATFORM IN ORDER Big Data Analytics A Practical Guide / Step 1: Getting your Data Platform in Order 1 INTRODUCTION Everybody keeps extolling
More informationPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
More informationOracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data
Oracle Big Data SQL Release 3.2 The unprecedented explosion in data that can be made useful to enterprises from the Internet of Things, to the social streams of global customer bases has created a tremendous
More informationDATACENTER SERVICES DATACENTER
SERVICES SOLUTION SUMMARY ALL CHANGE React, grow and innovate faster with Computacenter s agile infrastructure services Customers expect an always-on, superfast response. Businesses need to release new
More informationBig Data and Hadoop. Course Curriculum: Your 10 Module Learning Plan. About Edureka
Course Curriculum: Your 10 Module Learning Plan Big Data and Hadoop About Edureka Edureka is a leading e-learning platform providing live instructor-led interactive online training. We cater to professionals
More informationIntroduction to Hadoop. Owen O Malley Yahoo!, Grid Team
Introduction to Hadoop Owen O Malley Yahoo!, Grid Team owen@yahoo-inc.com Who Am I? Yahoo! Architect on Hadoop Map/Reduce Design, review, and implement features in Hadoop Working on Hadoop full time since
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data
More information