TESTING BIG DATA WORLD RIGA. by Konstantin Pletenev OCTOBER, 2017, TAPOST GROW CONFIDENTLY
|
|
- Anna Daisy McLaughlin
- 6 years ago
- Views:
Transcription
1 RIGA TESTING BIG DATA WORLD by Konstantin Pletenev OCTOBER, 2017, TAPOST GROW CONFIDENTLY
2 BIG DATA IS NOT ABOUT THE DATA THE REVOLUTION IS NOT THAT THERE S MORE DATA AVAILABLE THE REVOLUTION IS THAT WE KNOW WHAT TO DO WITH THIS DATA NOW GARY KING PROFESSOR OF HARVARD UNIVERSITY
3 65 MILLION SOCIAL MEDIA POSTS PER DAY 1 HOUR, 16 MINUTES EACH DAY WATCHING VIDEO ON DIGITAL DEVICES 100 MILLIONS NETFLIX STREAMING SUBSCRIBERS
4 WHAT IS BIG DATA
5 WHAT BIG DATA IS NOT TRADITIONAL DATA LIKE DOCUMENTS AND DATABASES JUST A SYNONYM FOR LOTS OF DATA IT S NOT ABOUT THE DATA
6 WHAT BIG DATA IS BIG DATA IS THE LARGE VOLUME OF DATA ON A DAY-TO- DAY BASIS WHAT DO WITH THE DATA MATTERS BIG DATA FOR MADE A BETTER DECISIONS AND STRATEGIC BUSINESS MOVES
7 3 BIG DATA PROJECTS IN ACCENTURE ON LAST 2 YEARS 1. USER DATA PROCESSING AND SERIALIZATION 2. VIDEO ON DEMAND DATA PROCESSING AND BUSINESS INTELLIGENCE DASHBOARDS 3. VIDEO ON DEMAND DATA PROCESSING FOR DATA SCIENTISTS
8 BIG DATA V 3 VOLUME VELOCITY BIG DATA VARIETY
9 BIG DATA V 3 VOLUME VARIETY VELOCITY
10 V 1 - VOLUME FACEBOOK STATISTIC 100 PB DISK SPACE ON SINGLE HDFS CLUSTER 150 TB DATA SCANNING IN HIVE EACH 30 MINUTES 500 TB NEW DATA INGESTED PER DAY
11 V 2 - VARIETY FACEBOOK STATISTIC 300 MB IMAGES UPLOADED EACH MINUTE 4.7 BILLION STATUS UPDATES EACH DAY 1 BILLION VIDEO WATCH'S PER DAY
12 V 3 - VELOCITY NETFLIX STATISTIC 37% OF ALL DOWNSTREAM INTERNET BANDWIDTH IN NORTH AMERICA REAL TIME DATA ANALYSIS PROCESS 100,000 SERVER INSTANCES
13 HOW BIG DATA IS PROCESSED Web site Validation Mobile app 3 rd app Data storage 3 rd app 3 rd app FTP S3 Hadoop HDD Processing MapR Hive Pig Data Scientist/Analytics Data Base Oracle RedShift MongoDb BI tools QlikView QlikSance Splunk
14 HOW BIG DATA IS PROCESSED Web site Validation Mobile app 3 rd app Data storage 3 rd app 3 rd app FTP S3 Hadoop HDD Processing MapR Hive Pig Data Scientist/Analytics Data Base Oracle RedShift MongoDb BI tools QlikView QlikSance Splunk
15 MAP REDUCE EXAMPLE MAPPING SHUFFLING REDUCING
16 BIG DATA TESTING VS TRADITIONAL TESTING
17 Properties Traditional testing Big data testing Data Testers works with structured data Testing approachs defined and time-tested Tester has the option of "Sampling" strategy doing manually Testers works with both structured and unstructured data Testing approach requires focused R&D efforts "Sampling" strategy in Big data is a challenge
18 Properties Traditional testing Big data testing Infrastructure Validation Tools It does not require special test environment as the file size is limited Tester uses either the Excel based macros or UI based automation tools Testing Tools can be used with basic operating knowledge and less training. It requires special test environment due to large data size and files (HDFS) No defined tools, the range is vast from programming tools like MapReduce to HIVEQL It requires a specific set of skills and training to operate tools. Copyright 2014 Accenture All rights reserved. 18
19 TESTING CHALLENGES 1. UNDERSTAND YOUR DATA 2. UNDERSTAND YOUR PROCESS (BATCH/REAL) 3. AUTOMATE AS MUCH AS POSSIBLE 4. READY TO A LOT OF PERFORMANCE TESTING 19
20 UNDERSTAND YOUR DATA
21 BATCH / REAL TIME BATCH SYSTEM 1. DATA GENERATION 2. DATA STAGING VALIDATION 3. DATA PROCESSING VALIDATION 4. OUTPUT VALIDATION REAL TIME SYSTEM 1. ACTIVITY SIMULATION 2. VELOCITY ANALYSIS 3. METRIC ANALYSIS 4. OUTPUT VALIDATION 21
22 DATA STAGING VALIDATION Web site Validation Mobile app 3 rd app Data storage 3 rd app 3 rd app FTP S3 Hadoop HDD Processing MapR Hive Pig Data Scientist/Analytics Data Base Oracle RedShift MongoDb BI tools QlikView QlikSance Splunk
23 DATA PROCESSING VALIDATION Web site Validation Mobile app 3 rd app Data storage 3 rd app 3 rd app FTP S3 Hadoop HDD Processing MapR Hive Pig Data Scientist/Analytics Data Base Oracle RedShift MongoDb BI tools QlikView QlikSance Splunk
24 OUTPUT VALIDATION Web site Validation Mobile app 3 rd app Data storage 3 rd app 3 rd app FTP S3 Hadoop HDD Processing MapR Hive Pig Data Scientist/Analytics Data Base Oracle RedShift MongoDb BI tools QlikView QlikSance Splunk
25 TEST AUTOMATION TOOLS DATA DRIVEN TESTING TALEND OPEN STUDIO DATA GENERATORS MAP REDUCE ACCEPTANCE TESTING TALEND OPEN STUDIO QUERY SURGE STUDIO PERFORMANCE TESTING GATLING TOOL DATA PRODUCERS OUTPUT TESTING BDD ACCEPTANCE SCENARIO PREPARED STATIC DATA
26 TALEND OPEN STUDIO
27 TALEND OPEN STUDIO MAP REDUCE
28 QUESTION TIME
29 THANK YOU
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem Zohar Elkayam www.realdbamagic.com Twitter: @realmgic Who am I? Zohar Elkayam, CTO at Brillix Programmer, DBA, team leader, database trainer,
More informationBig Data on AWS. Peter-Mark Verwoerd Solutions Architect
Big Data on AWS Peter-Mark Verwoerd Solutions Architect What to get out of this talk Non-technical: Big Data processing stages: ingest, store, process, visualize Hot vs. Cold data Low latency processing
More informationHadoop An Overview. - Socrates CCDH
Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected
More informationBIG DATA TESTING: A UNIFIED VIEW
http://core.ecu.edu/strg BIG DATA TESTING: A UNIFIED VIEW BY NAM THAI ECU, Computer Science Department, March 16, 2016 2/30 PRESENTATION CONTENT 1. Overview of Big Data A. 5 V s of Big Data B. Data generation
More informationSQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism
Big Data and Hadoop with Azure HDInsight Andrew Brust Senior Director, Technical Product Marketing and Evangelism Datameer Level: Intermediate Meet Andrew Senior Director, Technical Product Marketing and
More information2013 AWS Worldwide Public Sector Summit Washington, D.C.
2013 AWS Worldwide Public Sector Summit Washington, D.C. EMR for Fun and for Profit Ben Butler Sr. Manager, Big Data butlerb@amazon.com @bensbutler Overview 1. What is big data? 2. What is AWS Elastic
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationMaking the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor
Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,
More informationBest Practices and Performance Tuning on Amazon Elastic MapReduce
Best Practices and Performance Tuning on Amazon Elastic MapReduce Michael Hanisch Solutions Architect Amo Abeyaratne Big Data and Analytics Consultant ANZ 12.04.2016 2016, Amazon Web Services, Inc. or
More informationTop 25 Big Data Interview Questions And Answers
Top 25 Big Data Interview Questions And Answers By: Neeru Jain - Big Data The era of big data has just begun. With more companies inclined towards big data to run their operations, the demand for talent
More informationImporting and Exporting Data Between Hadoop and MySQL
Importing and Exporting Data Between Hadoop and MySQL + 1 About me Sarah Sproehnle Former MySQL instructor Joined Cloudera in March 2010 sarah@cloudera.com 2 What is Hadoop? An open-source framework for
More informationTalend Big Data Sandbox. Big Data Insights Cookbook
Overview Pre-requisites Setup & Configuration Hadoop Distribution Download Demo (Scenario) Overview Pre-requisites Setup & Configuration Hadoop Distribution Demo (Scenario) About this cookbook What is
More informationBig Data Architect.
Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional
More informationA Review Paper on Big data & Hadoop
A Review Paper on Big data & Hadoop Rupali Jagadale MCA Department, Modern College of Engg. Modern College of Engginering Pune,India rupalijagadale02@gmail.com Pratibha Adkar MCA Department, Modern College
More informationIntroduction to Hadoop. Owen O Malley Yahoo!, Grid Team
Introduction to Hadoop Owen O Malley Yahoo!, Grid Team owen@yahoo-inc.com Who Am I? Yahoo! Architect on Hadoop Map/Reduce Design, review, and implement features in Hadoop Working on Hadoop full time since
More informationHadoop, Yarn and Beyond
Hadoop, Yarn and Beyond 1 B. R A M A M U R T H Y Overview We learned about Hadoop1.x or the core. Just like Java evolved, Java core, Java 1.X, Java 2.. So on, software and systems evolve, naturally.. Lets
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 5: Analyzing Relational Data (1/3) February 8, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo
More informationThe Reality of Qlik and Big Data. Chris Larsen Q3 2016
The Reality of Qlik and Big Data Chris Larsen Q3 2016 Introduction Chris Larsen Sr Solutions Architect, Partner Engineering @Qlik Based in Lund, Sweden Primary Responsibility Advanced Analytics (and formerly
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationSpagoBI and Talend jointly support Big Data scenarios
SpagoBI and Talend jointly support Big Data scenarios Monica Franceschini - SpagoBI Architect SpagoBI Competency Center - Engineering Group Big-data Agenda Intro & definitions Layers Talend & SpagoBI SpagoBI
More informationA BigData Tour HDFS, Ceph and MapReduce
A BigData Tour HDFS, Ceph and MapReduce These slides are possible thanks to these sources Jonathan Drusi - SCInet Toronto Hadoop Tutorial, Amir Payberah - Course in Data Intensive Computing SICS; Yahoo!
More informationResource and Performance Distribution Prediction for Large Scale Analytics Queries
Resource and Performance Distribution Prediction for Large Scale Analytics Queries Prof. Rajiv Ranjan, SMIEEE School of Computing Science, Newcastle University, UK Visiting Scientist, Data61, CSIRO, Australia
More informationEmbedded Technosolutions
Hadoop Big Data An Important technology in IT Sector Hadoop - Big Data Oerie 90% of the worlds data was generated in the last few years. Due to the advent of new technologies, devices, and communication
More informationData Platforms and Pattern Mining
Morteza Zihayat Data Platforms and Pattern Mining IBM Corporation About Myself IBM Software Group Big Data Scientist 4Platform Computing, IBM (2014 Now) PhD Candidate (2011 Now) 4Lassonde School of Engineering,
More informationTaming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems
1 Taming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems The Defacto Choice For Convergence 2 ABSTRACT & SPEAKER BIO Dealing with enormous data growth is a key challenge for
More informationBIG DATA COURSE CONTENT
BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data
More informationIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce Who Am I - Ryan Tabora - Data Developer at Think Big Analytics - Big Data Consulting - Experience working with Hadoop, HBase, Hive, Solr, Cassandra, etc. 2 Who Am I -
More informationNowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype?
Big data hype? Big Data: Hype or Hallelujah? Data Base and Data Mining Group of 2 Google Flu trends On the Internet February 2010 detected flu outbreak two weeks ahead of CDC data Nowcasting http://www.internetlivestats.com/
More informationBig Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Big Data on AWS Big Data Agility and Performance Delivered in the Cloud 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Big Data Technologies and techniques for working productively
More informationData Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros
Data Clustering on the Parallel Hadoop MapReduce Model Dimitrios Verraros Overview The purpose of this thesis is to implement and benchmark the performance of a parallel K- means clustering algorithm on
More informationVOLTDB + HP VERTICA. page
VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics
More informationEvolution of Big Data Facebook. Architecture Summit, Shenzhen, August 2012 Ashish Thusoo
Evolution of Big Data Architectures@ Facebook Architecture Summit, Shenzhen, August 2012 Ashish Thusoo About Me Currently Co-founder/CEO of Qubole Ran the Data Infrastructure Team at Facebook till 2011
More informationFlash Storage Complementing a Data Lake for Real-Time Insight
Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum
More informationBig Data Facebook
Big Data Architectures@ Facebook QCon London 2012 Ashish Thusoo Outline Big Data @ Facebook - Scope & Scale Evolution of Big Data Architectures @ FB Past, Present and Future Questions Big Data @ FB: Scale
More informationSTATS Data Analysis using Python. Lecture 7: the MapReduce framework Some slides adapted from C. Budak and R. Burns
STATS 700-002 Data Analysis using Python Lecture 7: the MapReduce framework Some slides adapted from C. Budak and R. Burns Unit 3: parallel processing and big data The next few lectures will focus on big
More informationYuval Carmel Tel-Aviv University "Advanced Topics in Storage Systems" - Spring 2013
Yuval Carmel Tel-Aviv University "Advanced Topics in About & Keywords Motivation & Purpose Assumptions Architecture overview & Comparison Measurements How does it fit in? The Future 2 About & Keywords
More informationAn InterSystems Guide to the Data Galaxy. Benjamin De Boe Product Manager
An InterSystems Guide to the Data Galaxy Benjamin De Boe Product Manager Analytics 3 InterSystems Corporation. All rights reserved. 4 InterSystems Corporation. All rights reserved. 5 InterSystems Corporation.
More informationBig Data. Big Data Analyst. Big Data Engineer. Big Data Architect
Big Data Big Data Analyst INTRODUCTION TO BIG DATA ANALYTICS ANALYTICS PROCESSING TECHNIQUES DATA TRANSFORMATION & BATCH PROCESSING REAL TIME (STREAM) DATA PROCESSING Big Data Engineer BIG DATA FOUNDATION
More informationCloud Analytics and Business Intelligence on AWS
Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse
More informationConfiguring and Deploying Hadoop Cluster Deployment Templates
Configuring and Deploying Hadoop Cluster Deployment Templates This chapter contains the following sections: Hadoop Cluster Profile Templates, on page 1 Creating a Hadoop Cluster Profile Template, on page
More informationBig Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours
Big Data Hadoop Developer Course Content Who is the target audience? Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Complete beginners who want to learn Big Data Hadoop Professionals
More informationAcquiring Big Data to Realize Business Value
Acquiring Big Data to Realize Business Value Agenda What is Big Data? Common Big Data technologies Use Case Examples Oracle Products in the Big Data space In Summary: Big Data Takeaways
More informationMicrosoft Big Data and Hadoop
Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common
More informationThe Technology of the Business Data Lake. Appendix
The Technology of the Business Data Lake Appendix Pivotal data products Term Greenplum Database GemFire Pivotal HD Spring XD Pivotal Data Dispatch Pivotal Analytics Description A massively parallel platform
More informationMixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp
MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp Hadoop Pig, Hive Hadoop + Enterprise storage?! Shared storage
More informationTowards a Real- time Processing Pipeline: Running Apache Flink on AWS
Towards a Real- time Processing Pipeline: Running Apache Flink on AWS Dr. Steffen Hausmann, Solutions Architect Michael Hanisch, Manager Solutions Architecture November 18 th, 2016 Stream Processing Challenges
More informationBig Data Programming: an Introduction. Spring 2015, X. Zhang Fordham Univ.
Big Data Programming: an Introduction Spring 2015, X. Zhang Fordham Univ. Outline What the course is about? scope Introduction to big data programming Opportunity and challenge of big data Origin of Hadoop
More informationHDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung
HDFS: Hadoop Distributed File System CIS 612 Sunnie Chung What is Big Data?? Bulk Amount Unstructured Introduction Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per
More informationBull Fast Track/PDW and Big Data
Bull Fast Track/PDW and Big Data Add High Performance BI to your Big Data Roger Van Unen Expert Microsoft / BI roger.van-unen@bull.net http://www.bull.fr/bi/fastrack.html Michael Schmitter BI Sales Germany
More informationWe are ready to serve Latest Testing Trends, Are you ready to learn.?? New Batches Info
We are ready to serve Latest Testing Trends, Are you ready to learn.?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : About Quality Thought We are
More informationThe Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou
The Hadoop Ecosystem EECS 4415 Big Data Systems Tilemachos Pechlivanoglou tipech@eecs.yorku.ca A lot of tools designed to work with Hadoop 2 HDFS, MapReduce Hadoop Distributed File System Core Hadoop component
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationBig Data Hadoop Course Content
Big Data Hadoop Course Content Topics covered in the training Introduction to Linux and Big Data Virtual Machine ( VM) Introduction/ Installation of VirtualBox and the Big Data VM Introduction to Linux
More informationThe Mathematics of Big Data
The Mathematics of Big Data Philippe B. Laval KSU Fall 2017 Philippe B. Laval (KSU) Math & Big Data Fall 2017 1 / 10 Introduction We briefly present Big Data and the issues associated with Big Data. Philippe
More informationChase Wu New Jersey Institute of Technology
CS 644: Introduction to Big Data Chapter 4. Big Data Analytics Platforms Chase Wu New Jersey Institute of Technology Some of the slides were provided through the courtesy of Dr. Ching-Yung Lin at Columbia
More informationJoe Hummel, PhD. Visiting Researcher: U. of California, Irvine Adjunct Professor: U. of Illinois, Chicago & Loyola U., Chicago
Joe Hummel, PhD Visiting Researcher: U. of California, Irvine Adjunct Professor: U. of Illinois, Chicago & Loyola U., Chicago Materials: http://www.joehummel.net/downloads.html Email: joe@joehummel.net
More informationHADOOP FRAMEWORK FOR BIG DATA
HADOOP FRAMEWORK FOR BIG DATA Mr K. Srinivas Babu 1,Dr K. Rameshwaraiah 2 1 Research Scholar S V University, Tirupathi 2 Professor and Head NNRESGI, Hyderabad Abstract - Data has to be stored for further
More informationNew Challenges in Big Data: Technical Perspectives. Hwanjo Yu POSTECH
New Challenges in Big Data: Technical Perspectives Hwanjo Yu POSTECH http:/hwanjoyu.org Over 1 Billion SNS users!! Viral Marketing Word-of-Mouth Effect > TV advertising......... Influence Maximization
More informationOverview of Data Services and Streaming Data Solution with Azure
Overview of Data Services and Streaming Data Solution with Azure Tara Mason Senior Consultant tmason@impactmakers.com Platform as a Service Offerings SQL Server On Premises vs. Azure SQL Server SQL Server
More information2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice
2014 年 3 月 13 日星期四 From Big Data to Big Value Infrastructure Needs and Huawei Best Practice Data-driven insight Making better, more informed decisions, faster Raw Data Capture Store Process Insight 1 Data
More informationProcessing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.
Processing Unstructured Data Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd. http://dinesql.com / Dinesh Priyankara @dinesh_priya Founder/Principal Architect dinesql Pvt Ltd. Microsoft Most
More informationARC 306: Lumberjacking on AWS Cutting Through Logs to Find What Matters
ARC 306: Lumberjacking on AWS Cutting Through Logs to Find What Matters Guy Ernest, Solutions Architecture November 15, 2013 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied,
More informationTalend Big Data Sandbox. Big Data Insights Cookbook
Overview Pre-requisites Setup & Configuration Hadoop Distribution Download Demo (Scenario) Overview Pre-requisites Setup & Configuration Hadoop Distribution Demo (Scenario) About this cookbook What is
More informationBring Context To Your Machine Data With Hadoop, RDBMS & Splunk
Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may
More informationFEATURES BENEFITS SUPPORTED PLATFORMS. Reduce costs associated with testing data projects. Expedite time to market
E TL VALIDATOR DATA SHEET FEATURES BENEFITS SUPPORTED PLATFORMS ETL Testing Automation Data Quality Testing Flat File Testing Big Data Testing Data Integration Testing Wizard Based Test Creation No Custom
More informationWhere We Are. Review: Parallel DBMS. Parallel DBMS. Introduction to Data Management CSE 344
Where We Are Introduction to Data Management CSE 344 Lecture 22: MapReduce We are talking about parallel query processing There exist two main types of engines: Parallel DBMSs (last lecture + quick review)
More informationBig Data Infrastructure at Spotify
Big Data Infrastructure at Spotify Wouter de Bie Team Lead Data Infrastructure September 26, 2013 2 Who am I? According to ZDNet: "The work they have done to improve the Apache Hive data warehouse system
More informationCopyright 2012, Oracle and/or its affiliates. All rights reserved.
1 Big Data Connectors: High Performance Integration for Hadoop and Oracle Database Melli Annamalai Sue Mavris Rob Abbott 2 Program Agenda Big Data Connectors: Brief Overview Connecting Hadoop with Oracle
More informationWebinar Series TMIP VISION
Webinar Series TMIP VISION TMIP provides technical support and promotes knowledge and information exchange in the transportation planning and modeling community. Today s Goals To Consider: Parallel Processing
More informationDHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI
DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Information Technology IT6701 - INFORMATION MANAGEMENT Anna University 2 & 16 Mark Questions & Answers Year / Semester: IV / VII Regulation: 2013
More informationThe amount of data increases every day Some numbers ( 2012):
1 The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect
More information2/26/2017. The amount of data increases every day Some numbers ( 2012):
The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect to
More informationIntroduction to the Mathematics of Big Data. Philippe B. Laval
Introduction to the Mathematics of Big Data Philippe B. Laval Fall 2017 Introduction In recent years, Big Data has become more than just a buzz word. Every major field of science, engineering, business,
More informationBig Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018
Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/
More informationModern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics
More informationBig Data Hadoop Stack
Big Data Hadoop Stack Lecture #1 Hadoop Beginnings What is Hadoop? Apache Hadoop is an open source software framework for storage and large scale processing of data-sets on clusters of commodity hardware
More informationHadoop. Introduction / Overview
Hadoop Introduction / Overview Preface We will use these PowerPoint slides to guide us through our topic. Expect 15 minute segments of lecture Expect 1-4 hour lab segments Expect minimal pretty pictures
More informationBased on Big Data: Hype or Hallelujah? by Elena Baralis
Based on Big Data: Hype or Hallelujah? by Elena Baralis http://dbdmg.polito.it/wordpress/wp-content/uploads/2010/12/bigdata_2015_2x.pdf 1 3 February 2010 Google detected flu outbreak two weeks ahead of
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationCloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018
Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning
More informationMapR Enterprise Hadoop
2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS
More informationThe Evolution of Big Data Platforms and Data Science
IBM Analytics The Evolution of Big Data Platforms and Data Science ECC Conference 2016 Brandon MacKenzie June 13, 2016 2016 IBM Corporation Hello, I m Brandon MacKenzie. I work at IBM. Data Science - Offering
More informationReal Time for Big Data: The Next Age of Data Management. Talksum, Inc. Talksum, Inc. 582 Market Street, Suite 1902, San Francisco, CA 94104
Real Time for Big Data: The Next Age of Data Management Talksum, Inc. Talksum, Inc. 582 Market Street, Suite 1902, San Francisco, CA 94104 Real Time for Big Data The Next Age of Data Management Introduction
More informationSTUDENT GRADE IMPROVEMENT INHIGHER STUDIES
STUDENT GRADE IMPROVEMENT INHIGHER STUDIES Sandhya P. Pandey Assistant Professor, The S.I.A college of Higher Education, Dombivili( E), Thane, Maharastra. Abstract: In India Higher educational institutions
More informationPercona Live September 21-23, 2015 Mövenpick Hotel Amsterdam
Percona Live 2015 September 21-23, 2015 Mövenpick Hotel Amsterdam MongoDB, Elastic, and Hadoop: The What, When, and How Kimberly Wilkins Principal Engineer/Database Denizen ObjectRocket/Rackspace kimberly@objectrocket.com
More informationIT directors, CIO s, IT Managers, BI Managers, data warehousing professionals, data scientists, enterprise architects, data architects
Organised by: www.unicom.co.uk OVERVIEW This two day workshop is aimed at getting Data Scientists, Data Warehousing and BI professionals up to scratch on Big Data, Hadoop, other NoSQL DBMSs and Multi-Platform
More information<Insert Picture Here> Introduction to Big Data Technology
Introduction to Big Data Technology The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
More informationNew Data Architectures For Netflow Analytics NANOG 74. Fangjin Yang - Imply
New Data Architectures For Netflow Analytics NANOG 74 Fangjin Yang - Cofounder @ Imply The Problem Comparing technologies Overview Operational analytic databases Try this at home The Problem Netflow data
More informationEMC s IT TRANSFORMATION
EMC s IT TRANSFORMATION Sanjay Mirchandani Chief Information Officer 1 EMC IT At A Glance INTERNAL USERS IT ENVIRONMENT BUSINESS APPLICATIONS VIRTUALIZATION 2004 24,000 5 DATA CENTERS, 960 TB STORAGE ~400
More informationA Survey on Big Data, Hadoop and it s Ecosystem
A Survey on Big Data, Hadoop and it s Ecosystem Jyotsna Y. Mali jyotsmali@gmail.com Abhaysinh V. Surve avsurve@tkietwarana.org Vaishali R. Khot vaishalikhot25@gmail.com Anita A. Bhosale bhosale.anita11@gmail.com
More informationWindows Azure Overview
Windows Azure Overview Christine Collet, Genoveva Vargas-Solar Grenoble INP, France MS Azure Educator Grant Packaged Software Infrastructure (as a Service) Platform (as a Service) Software (as a Service)
More informationBig Data Syllabus. Understanding big data and Hadoop. Limitations and Solutions of existing Data Analytics Architecture
Big Data Syllabus Hadoop YARN Setup Programming in YARN framework j Understanding big data and Hadoop Big Data Limitations and Solutions of existing Data Analytics Architecture Hadoop Features Hadoop Ecosystem
More informationIntroduction to Big-Data
Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,
More informationIntegrating Splunk with AWS services:
Integrating Splunk with AWS services: Using Redshi+, Elas0c Map Reduce (EMR), Amazon Machine Learning & S3 to gain ac0onable insights via predic0ve analy0cs via Splunk Patrick Shumate Solutions Architect,
More informationHDFS Federation. Sanjay Radia Founder and Hortonworks. Page 1
HDFS Federation Sanjay Radia Founder and Architect @ Hortonworks Page 1 About Me Apache Hadoop Committer and Member of Hadoop PMC Architect of core-hadoop @ Yahoo - Focusing on HDFS, MapReduce scheduler,
More informationMAPR DATA GOVERNANCE WITHOUT COMPROMISE
MAPR TECHNOLOGIES, INC. WHITE PAPER JANUARY 2018 MAPR DATA GOVERNANCE TABLE OF CONTENTS EXECUTIVE SUMMARY 3 BACKGROUND 4 MAPR DATA GOVERNANCE 5 CONCLUSION 7 EXECUTIVE SUMMARY The MapR DataOps Governance
More informationThe age of Big Data Big Data for Oracle Database Professionals
The age of Big Data Big Data for Oracle Database Professionals Oracle OpenWorld 2017 #OOW17 SessionID: SUN5698 Tom S. Reddy tom.reddy@datareddy.com About the Speaker COLLABORATE & OpenWorld Speaker IOUG
More informationProcessing of big data with Apache Spark
Processing of big data with Apache Spark JavaSkop 18 Aleksandar Donevski AGENDA What is Apache Spark? Spark vs Hadoop MapReduce Application Requirements Example Architecture Application Challenges 2 WHAT
More informationA Review Approach for Big Data and Hadoop Technology
International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 A Review Approach for Big Data and Hadoop Technology Prof. Ghanshyam Dhomse
More information@Pentaho #BigDataWebSeries
Enterprise Data Warehouse Optimization with Hadoop Big Data @Pentaho #BigDataWebSeries Your Hosts Today Dave Henry SVP Enterprise Solutions Davy Nys VP EMEA & APAC 2 Source/copyright: The Human Face of
More information