Migrating from Oracle to Espresso
|
|
- Roberta Paul
- 5 years ago
- Views:
Transcription
1 Migrating from Oracle to Espresso David Max Senior Software Engineer LinkedIn
2 About LinkedIn New York Engineering Located in Empire State Building Approximately 100 engineers and 1000 employees total New York Engineering Multiple teams, front end, back end, and data science
3 About Me Software Engineer at LinkedIn NYC since 2015 Content Ingestion team Office Hours Thursday 11:30-12:00 David Max Senior Software Engineer LinkedIn
4 What is Content Ingestion? Content Ingestion Babylonia
5 Babylonia Content Ingestion
6 Babylonia Content Ingestion
7 url: title: "SATURN 2017 Keynote: Software is Details Babylonia Content Ingestion image: poaymweyckgbef5ivfkriqkdcwgbfqaaieiyaxab\\u00 26rs=AOn4CLClwjQlBmMeoRCePtHaThN-qXRHqg
8 Babylonia Content Ingestion
9 What is Content Ingestion? Extracts metadata from web pages Source of Truth for 3 rd party content Also contains metadata for some public 1 st party content Babylonia Content Ingestion Used by LinkedIn services for sharing, decorating, and embedding content Data also feeds into content understanding and relevance models
10 Babylonia Datasets HDFS ETL Babylonia Content Ingestion Data Change Events
11 Downstream and Upstream Datasets HDFS ETL Offline Babylonia Content Ingestion Data Change Events Near Line
12 Babylonia use of Oracle (before migration) RDBMS Relational Management System Databus Platform for streaming data change events to near line consumers Offline ETL to HDFS for offline consumers Schema Metadata extracted from each URL stored in individual rows Client Babylonia the main (but not only) client to directly execute queries on Oracle DB Rest.li Most online interaction with dataset in Oracle via Babylonia s Rest.li API
13 Espresso is LinkedIn s strategic distributed, fault-tolerant NoSQL What is Espresso? database that powers many of LinkedIn s services ~100 clusters in use* ~420TB of SoT data* ~2 million qps at peak load* * as of August 1, 2017
14 What is Espresso? NoSQL Non relational Distributed A single database can be distributed over a cluster of machines Scalable Able to scale clusters horizontally by adding more nodes Document A table is a container for documents of the same schema (defined in Avro) Keys Documents index by key fields, which are defined in the table schema
15 Why Migrate? Maintenance Babylonia s Oracle tables required periodic jobs to be run that involved downtime for each Integration Support for Espresso integrated with other tools and systems at LinkedIn server Rest.li Espresso s API is based on Cost Oracle more expensive to run Strategy Espresso is the preferred platform at LinkedIn for data of this type Support Espresso team part of LinkedIn Rest.li, which makes it easier to treat Espresso endpoints like other LinkedIn Rest.li endpoints Schema Evolution Supported with zero downtime and no coordination with DBA teams
16 Data Formats (Oracle) Rest.li Pegasus Object Oracle Row Oracle Row Endpoints Oracle Row Oracle HDFS Offline Pegasus ETL Data Babylonia Content Ingestion Oracle Databus Events Complex transformation between Oracle format and Pegasus format Near Line Oracle Row
17 Pegasus and Avro Pegasus and Avro schema Pegasus Schema Avro Schema definitions are very similar Both can be used to generate Java objects with very similar interfaces Java Objects Java Objects Pegasus schema can be used to auto-generate the Avro schema
18 Data Formats (Espresso) Rest.li Pegasus Object Espresso Avro Espresso Avro Endpoints Espresso Avro Espresso HDFS Offline Pegasus ETL Data Babylonia Content Ingestion Espresso Brooklin Events Simple transformation between Avro format and Pegasus format Near Line Espresso Avro
19 Why Migrate? Schema Evolution Espresso ALTER TABLE Document schema auto-registration Not tied to code deployment need to coordinate with DBAs Schema changes are registered automatically as part of the Babylonia deployment process Schema change involves server downtime Backwards compatibility is enforced existing data does not need to be In practice, developers go to great lengths to avoid the hassle transformed Avro schema more natural fit with Schema accumulates tech debt Rest.li Pegasus schema
20 Zero down time Goals for Migration Process Transparent to Rest.li clients Give offline and nearline consumers time to migrate Validate each step Mirroring in real time
21 Pre-Migration State of Babylonia Oracle HDFS ETL Offline Babylonia Content Ingestion Oracle Databus Events Near Line
22 Pre-Migration State of Babylonia Rest.li Endpoints Oracle Rest.li Calls Oracle Databus Events Other Services
23 Pre-Migration Cleanup Rest.li Endpoints Identify code that is Oracle tightly-coupled to the database Rest.li Calls Oracle Databus Events Decide which code should be reimplemented for Espresso, and which code should be decoupled or eliminated. Other Services Reduce number of code paths to migrate The easiest lines of code to migrate are the lines of code that don t exist
24 Bootstrap Espresso Oracle HDFS ETL Offline Convert Job Espresso Espresso Bulk Loader Avro Data File
25 Bootstrap Espresso Oracle HDFS ETL Espresso
26 Databus Listener, Shadow Read Validation Shadow Read Oracle Validation Oracle Databus Events Espresso Databus Listener
27 Direct Writes to Espresso Shadow Read Oracle Validation Oracle Databus Events Direct Write Espresso Databus Listener
28 Resolving Write Conflicts Migration Control optional field added to scheme Dual Write Conflict Databus Listener and Babylonia updating same record indicating which process wrote the record: Bulk Loader, Databus listener, or Babylonia Oracle Databus Events Direct Write Espresso Databus Listener
29 Espresso New SoT Dual Writes Oracle Deprecated Oracle Databus Events Direct Read/Write Espresso Espresso Brooklin Events
30 Oracle Turnoff Direct Read/Write Espresso Espresso Brooklin Events
31 Thank you
Data Infrastructure at LinkedIn. Shirshanka Das XLDB 2011
Data Infrastructure at LinkedIn Shirshanka Das XLDB 2011 1 Me UCLA Ph.D. 2005 (Distributed protocols in content delivery networks) PayPal (Web frameworks and Session Stores) Yahoo! (Serving Infrastructure,
More informationVoldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Voldemort Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/29 Outline 1 2 3 Smruti R. Sarangi Leader Election 2/29 Data
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationCopyright 2012, Oracle and/or its affiliates. All rights reserved.
1 Big Data Connectors: High Performance Integration for Hadoop and Oracle Database Melli Annamalai Sue Mavris Rob Abbott 2 Program Agenda Big Data Connectors: Brief Overview Connecting Hadoop with Oracle
More informationBig Data Integration Patterns. Michael Häusler Jun 12, 2017
Big Data Integration Patterns Michael Häusler Jun 12, 2017 ResearchGate is built for scientists. The social network gives scientists new tools to connect, collaborate, and keep up with the research that
More informationExtending the Scope of Custom Transformations
Paper 3306-2015 Extending the Scope of Custom Transformations Emre G. SARICICEK, The University of North Carolina at Chapel Hill. ABSTRACT Building and maintaining a data warehouse can require complex
More informationGriddable.io architecture
Griddable.io architecture Executive summary This whitepaper presents the architecture of griddable.io s smart grids for synchronized data integration. Smart transaction grids are a novel concept aimed
More informationEsper EQC. Horizontal Scale-Out for Complex Event Processing
Esper EQC Horizontal Scale-Out for Complex Event Processing Esper EQC - Introduction Esper query container (EQC) is the horizontal scale-out architecture for Complex Event Processing with Esper and EsperHA
More informationAn Information Asset Hub. How to Effectively Share Your Data
An Information Asset Hub How to Effectively Share Your Data Hello! I am Jack Kennedy Data Architect @ CNO Enterprise Data Management Team Jack.Kennedy@CNOinc.com 1 4 Data Functions Your Data Warehouse
More informationApache HBase Andrew Purtell Committer, Apache HBase, Apache Software Foundation Big Data US Research And Development, Intel
Apache HBase 0.98 Andrew Purtell Committer, Apache HBase, Apache Software Foundation Big Data US Research And Development, Intel Who am I? Committer on the Apache HBase project Member of the Big Data Research
More informationApache Hadoop Goes Realtime at Facebook. Himanshu Sharma
Apache Hadoop Goes Realtime at Facebook Guide - Dr. Sunny S. Chung Presented By- Anand K Singh Himanshu Sharma Index Problem with Current Stack Apache Hadoop and Hbase Zookeeper Applications of HBase at
More informationEvolution of an Apache Spark Architecture for Processing Game Data
Evolution of an Apache Spark Architecture for Processing Game Data Nick Afshartous WB Analytics Platform May 17 th 2017 May 17 th, 2017 About Me nafshartous@wbgames.com WB Analytics Core Platform Lead
More informationState of the Dolphin Developing new Apps in MySQL 8
State of the Dolphin Developing new Apps in MySQL 8 Highlights of MySQL 8.0 technology updates Mark Swarbrick MySQL Principle Presales Consultant Jill Anolik MySQL Global Business Unit Israel Copyright
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationBuilding LinkedIn s Real-time Data Pipeline. Jay Kreps
Building LinkedIn s Real-time Data Pipeline Jay Kreps What is a data pipeline? What data is there? Database data Activity data Page Views, Ad Impressions, etc Messaging JMS, AMQP, etc Application and
More information5 Fundamental Strategies for Building a Data-centered Data Center
5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse
More informationSchema Registry Overview
3 Date of Publish: 2018-11-15 https://docs.hortonworks.com/ Contents...3 Examples of Interacting with Schema Registry...4 Schema Registry Use Cases...6 Use Case 1: Registering and Querying a Schema for
More informationIncrease Value from Big Data with Real-Time Data Integration and Streaming Analytics
Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time
More informationPNDA.io: when BGP meets Big-Data
PNDA.io: when BGP meets Big-Data Let s go back in time 26 th April 2017 The Internet is very much alive Millions of BGP events occurring every day 15 Routers Monitored 410 active peers (both IPv4 and IPv6)
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationManaging Data at Scale: Microservices and Events. Randy linkedin.com/in/randyshoup
Managing Data at Scale: Microservices and Events Randy Shoup @randyshoup linkedin.com/in/randyshoup Background VP Engineering at Stitch Fix o Combining Art and Science to revolutionize apparel retail Consulting
More informationA Journey to DynamoDB
A Journey to DynamoDB and maybe away from DynamoDB Adam Dockter VP of Engineering ServiceTarget Who are we? Small Company 4 Developers AWS Infrastructure NO QA!! About our product Self service web application
More informationBIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,
BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 1 OBJECTIVES ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 2 WHAT
More informationMicroservices without the Servers: AWS Lambda in Action
Microservices without the Servers: AWS Lambda in Action Dr. Tim Wagner, General Manager AWS Lambda August 19, 2015 Seattle, WA 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Two
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationA never-ending database migration
A never-ending database migration Charles Delort IT-DB November 20, 2017 Table of Contents Years ago, decisions were made A few years later PostgreSQL Foreign Data Wrappers First step of Migration Apiato
More informationDatabricks, an Introduction
Databricks, an Introduction Chuck Connell, Insight Digital Innovation Insight Presentation Speaker Bio Senior Data Architect at Insight Digital Innovation Focus on Azure big data services HDInsight/Hadoop,
More informationBig Data Analytics. Rasoul Karimi
Big Data Analytics Rasoul Karimi Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 1 Outline
More informationData Acquisition. The reference Big Data stack
Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Data Acquisition Corso di Sistemi e Architetture per Big Data A.A. 2017/18 Valeria Cardellini The reference
More informationOwn change. TECHNICAL WHITE PAPER Data Integration With REST API
TECHNICAL WHITE PAPER Data Integration With REST API Real-time or near real-time accurate and fast retrieval of key metrics is a critical need for an organization. Many times, valuable data are stored
More informationVerarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016
Verarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016 Hans Viehmann Product Manager EMEA ORACLE Corporation 12. Mai 2016 Safe Harbor Statement The following
More informationDBManager Database operations management at Dropbox
DBManager Database operations management at Dropbox! David Turner MySQL Connect 2014 Who are we? Renjish Abraham - tech lead and vision for DBManager Aleksander Kuzminsky - backup and recovery expert David
More informationData Acquisition. The reference Big Data stack
Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Data Acquisition Corso di Sistemi e Architetture per Big Data A.A. 2016/17 Valeria Cardellini The reference
More informationDATABASE SCALE WITHOUT LIMITS ON AWS
The move to cloud computing is changing the face of the computer industry, and at the heart of this change is elastic computing. Modern applications now have diverse and demanding requirements that leverage
More informationRevamped and Automated the infrastructure for NTN Buzztime
Revamped and Automated the infrastructure for NTN Buzztime Executive Summary NTN Buzztime Inc. was looking for scalable infrastructure with a new platform that could support display of real-time restaurant
More informationScaling for Humongous amounts of data with MongoDB
Scaling for Humongous amounts of data with MongoDB Alvin Richards Technical Director, EMEA alvin@10gen.com @jonnyeight alvinonmongodb.com From here... http://bit.ly/ot71m4 ...to here... http://bit.ly/oxcsis
More informationMicrosoft Big Data and Hadoop
Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationBlended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)
Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Cloudera s Developer Training for Apache Spark and Hadoop delivers the key concepts and expertise need to develop high-performance
More informationScaling Marketplaces at Thumbtack QCon SF 2017
Scaling Marketplaces at Thumbtack QCon SF 2017 Nate Kupp Technical Infrastructure Data Eng, Experimentation, Platform Infrastructure, Security, Dev Tools Infrastructure from early beginnings You see that?
More informationTHE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES
1 THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon Vincent.Garonne@cern.ch ph-adp-ddm-lab@cern.ch XLDB
More informationOverview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::
Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationAn Insider s Guide to Oracle Autonomous Transaction Processing
An Insider s Guide to Oracle Autonomous Transaction Processing Maria Colgan Master Product Manager Troy Anthony Senior Director, Product Management #thinkautonomous Autonomous Database Traditionally each
More informationData Analytics at Logitech Snowflake + Tableau = #Winning
Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 5: Analyzing Relational Data (1/3) February 8, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo
More informationPersonalizing Netflix with Streaming datasets
Personalizing Netflix with Streaming datasets Shriya Arora Senior Data Engineer Personalization Analytics @shriyarora What is this talk about? Helping you decide if a streaming pipeline fits your ETL problem
More informationScott Meder Senior Regional Sales Manager
www.raima.com Scott Meder Senior Regional Sales Manager scott.meder@raima.com Short Introduction to Raima What is Data Management What are your requirements? How do I make the right decision? - Architecture
More informationApache Hive for Oracle DBAs. Luís Marques
Apache Hive for Oracle DBAs Luís Marques About me Oracle ACE Alumnus Long time open source supporter Founder of Redglue (www.redglue.eu) works for @redgluept as Lead Data Architect @drune After this talk,
More informationDatabase Assessment for PDMS
Database Assessment for PDMS Abhishek Gaurav, Nayden Markatchev, Philip Rizk and Rob Simmonds Grid Research Centre, University of Calgary. http://grid.ucalgary.ca 1 Introduction This document describes
More informationIn-memory data pipeline and warehouse at scale using Spark, Spark SQL, Tachyon and Parquet
In-memory data pipeline and warehouse at scale using Spark, Spark SQL, Tachyon and Parquet Ema Iancuta iorhian@gmail.com Radu Chilom radu.chilom@gmail.com Big data analytics / machine learning 6+ years
More informationBIG DATA REVOLUTION IN JOBRAPIDO
BIG DATA REVOLUTION IN JOBRAPIDO Michele Pinto Big Data Technical Team Leader @ Jobrapido Big Data Tech 2016 Firenze - October 20, 2016 ABOUT ME NAME Michele Pinto LINKEDIN https://www.linkedin.com/in/pintomichele
More informationIBM Data Replication for Big Data
IBM Data Replication for Big Data Highlights Stream changes in realtime in Hadoop or Kafka data lakes or hubs Provide agility to data in data warehouses and data lakes Achieve minimum impact on source
More informationRealtime visitor analysis with Couchbase and Elasticsearch
Realtime visitor analysis with Couchbase and Elasticsearch Jeroen Reijn @jreijn #nosql13 About me Jeroen Reijn Software engineer Hippo @jreijn http://blog.jeroenreijn.com About Hippo Visitor Analysis OneHippo
More informationBecause databases are not easily accessible by Hadoop, Apache Sqoop was created to efficiently transfer bulk data between Hadoop and external
Because databases are not easily accessible by Hadoop, Apache Sqoop was created to efficiently transfer bulk data between Hadoop and external structured datastores. The popularity of Sqoop in enterprise
More informationScaleArc for SQL Server
Solution Brief ScaleArc for SQL Server Overview Organizations around the world depend on SQL Server for their revenuegenerating, customer-facing applications, running their most business-critical operations
More informationTransformation-free Data Pipelines by combining the Power of Apache Kafka and the Flexibility of the ESB's
Building Agile and Resilient Schema Transformations using Apache Kafka and ESB's Transformation-free Data Pipelines by combining the Power of Apache Kafka and the Flexibility of the ESB's Ricardo Ferreira
More informationIntroduction to Oracle NoSQL Database
Introduction to Oracle NoSQL Database Anand Chandak Ashutosh Naik Agenda NoSQL Background Oracle NoSQL Database Overview Technical Features & Performance Use Cases 2 Why NoSQL? 1. The four V s of Big Data
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationMigrating Oracle Databases To Cassandra
BY UMAIR MANSOOB Why Cassandra Lower Cost of ownership makes it #1 choice for Big Data OLTP Applications. Unlike Oracle, Cassandra can store structured, semi-structured, and unstructured data. Cassandra
More informationAdvanced Migration of Schema and Data across Multiple Databases
Advanced Migration of Schema and Data across Multiple Databases D.M.W.E. Dissanayake 139163B Faculty of Information Technology University of Moratuwa May 2017 Advanced Migration of Schema and Data across
More informationA Glimpse of the Hadoop Echosystem
A Glimpse of the Hadoop Echosystem 1 Hadoop Echosystem A cluster is shared among several users in an organization Different services HDFS and MapReduce provide the lower layers of the infrastructures Other
More informationEasy Ways to Improve Oracle Information Sharing With Data Replication
Easy Ways to Improve Oracle Information Sharing With Data Replication Replace Oracle Streams Susan Wong Solutions Architect August 20, 2015 Agenda Introduction to replication Value of replication solutions:
More informationModernizing Business Intelligence and Analytics
Modernizing Business Intelligence and Analytics Justin Erickson Senior Director, Product Management 1 Agenda What benefits can I achieve from modernizing my analytic DB? When and how do I migrate from
More informationPostgres-XC PostgreSQL Conference Michael PAQUIER Tokyo, 2012/02/24
Postgres-XC PostgreSQL Conference 2012 Michael PAQUIER Tokyo, 2012/02/24 Agenda Self-introduction Highlights of Postgres-XC Core architecture overview Performance High-availability Release status Copyright
More informationebay Marketplace Architecture
ebay Marketplace Architecture Architectural Strategies, Patterns, and Forces Randy Shoup, ebay Distinguished Architect QCon SF 2007 November 9, 2007 What we re up against ebay manages Over 248,000,000
More informationMicroservices at Netflix Scale. First Principles, Tradeoffs, Lessons Learned Ruslan
Microservices at Netflix Scale First Principles, Tradeoffs, Lessons Learned Ruslan Meshenberg @rusmeshenberg Microservices: all benefits, no costs? Netflix is the world s leading Internet television network
More informationLambda Architecture for Batch and Stream Processing. October 2018
Lambda Architecture for Batch and Stream Processing October 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only.
More informationThis presentation is for informational purposes only and may not be incorporated into a contract or agreement.
This presentation is for informational purposes only and may not be incorporated into a contract or agreement. Oracle10g RDF Data Mgmt: In Life Sciences Xavier Lopez Director, Server Technologies Oracle
More informationBig Data Technology Incremental Processing using Distributed Transactions
Big Data Technology Incremental Processing using Distributed Transactions Eshcar Hillel Yahoo! Ronny Lempel Outbrain *Based on slides by Edward Bortnikov and Ohad Shacham Roadmap Previous classes Stream
More informationOracle Streams. An Oracle White Paper October 2002
Oracle Streams An Oracle White Paper October 2002 Oracle Streams Executive Overview... 3 Introduction... 3 Oracle Streams Overview... 4... 5 Staging... 5 Propagation... 6 Transformations... 6 Consumption...
More informationObject Persistence Design Guidelines
Object Persistence Design Guidelines Motivation Design guideline supports architects and developers in design and development issues of binding object-oriented applications to data sources The major task
More informationScaling ML in Ad Tech. Giri Iyengar
Scaling ML in Ad Tech Giri Iyengar Agenda Introduction What are AdTech Platforms? Big Data in Ad Tech Some Data Science Projects in Ad Tech Technical & Operational Challenges In Search of an ML Platform
More informationIntroduction to NoSQL
Introduction to NoSQL Agenda History What is NoSQL Types of NoSQL The CAP theorem History - RDBMS Relational DataBase Management Systems were invented in the 1970s. E. F. Codd, "Relational Model of Data
More informationDatabase Administration. Database Administration CSCU9Q5. The Data Dictionary. 31Q5/IT31 Database P&A November 7, Overview:
Database Administration CSCU9Q5 Slide 1 Database Administration Overview: Data Dictionary Data Administrator Database Administrator Distributed Databases Slide 2 The Data Dictionary A DBMS must provide
More informationOutline. INF3190:Distributed Systems - Examples. Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles
INF3190:Distributed Systems - Examples Thomas Plagemann & Roman Vitenberg Outline Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles Today: Examples Googel File System (Thomas)
More informationTour of Database Platforms as a Service. June 2016 Warner Chaves Christo Kutrovsky Solutions Architect
Tour of Database Platforms as a Service June 2016 Warner Chaves Christo Kutrovsky Solutions Architect Bio Solutions Architect at Pythian Specialize high performance data processing and analytics 15 years
More informationOracle Associate User With Schema Difference Between
Oracle Associate User With Schema Difference Between Use the CREATE USER statement to create and configure a database user, which is an value because it might result in conflicts between the names of local
More information1
1 2 3 6 7 8 9 10 Storage & IO Benchmarking Primer Running sysbench and preparing data Use the prepare option to generate the data. Experiments Run sysbench with different storage systems and instance
More informationManual Trigger Sql Server 2008 Insert Multiple Rows At Once
Manual Trigger Sql Server 2008 Insert Multiple Rows At Once Adding SQL Trigger to update field on INSERT (multiple rows) However, if there are multiple records inserted (as in the user creates several
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationArchitecture of a Real-Time Operational DBMS
Architecture of a Real-Time Operational DBMS Srini V. Srinivasan Founder, Chief Development Officer Aerospike CMG India Keynote Thane December 3, 2016 [ CMGI Keynote, Thane, India. 2016 Aerospike Inc.
More informationSecurity and Performance advances with Oracle Big Data SQL
Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,
More informationNew Features Guide Sybase ETL 4.9
New Features Guide Sybase ETL 4.9 Document ID: DC00787-01-0490-01 Last revised: September 2009 This guide describes the new features in Sybase ETL 4.9. Topic Page Using ETL with Sybase Replication Server
More informationCS November 2018
Bigtable Highly available distributed storage Distributed Systems 19. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account
More informationMySQL Group Replication. Bogdan Kecman MySQL Principal Technical Engineer
MySQL Group Replication Bogdan Kecman MySQL Principal Technical Engineer Bogdan.Kecman@oracle.com 1 Safe Harbor Statement The following is intended to outline our general product direction. It is intended
More informationEvolving To The Big Data Warehouse
Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from
More informationFLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM
FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM RECOMMENDATION AND JUSTIFACTION Executive Summary: VHB has been tasked by the Florida Department of Transportation District Five to design
More informationCS November 2017
Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account
More informationSurvey of Oracle Database
Survey of Oracle Database About Oracle: Oracle Corporation is the largest software company whose primary business is database products. Oracle database (Oracle DB) is a relational database management system
More informationEvolution of Big Data Facebook. Architecture Summit, Shenzhen, August 2012 Ashish Thusoo
Evolution of Big Data Architectures@ Facebook Architecture Summit, Shenzhen, August 2012 Ashish Thusoo About Me Currently Co-founder/CEO of Qubole Ran the Data Infrastructure Team at Facebook till 2011
More informationAugust Oracle - GoldenGate Statement of Direction
August 2015 Oracle - GoldenGate Statement of Direction Disclaimer This document in any form, software or printed matter, contains proprietary information that is the exclusive property of Oracle. Your
More informationMongoDB - a No SQL Database What you need to know as an Oracle DBA
MongoDB - a No SQL Database What you need to know as an Oracle DBA David Burnham Aims of this Presentation To introduce NoSQL database technology specifically using MongoDB as an example To enable the
More informationAzure Certification BootCamp for Exam (Developer)
Azure Certification BootCamp for Exam 70-532 (Developer) Course Duration: 5 Days Course Authored by CloudThat Description Microsoft Azure is a cloud computing platform and infrastructure created for building,
More informationSystems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2013/14
Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2013/14 MapReduce & Hadoop The new world of Big Data (programming model) Overview of this Lecture Module Background Cluster File
More informationComparing SQL and NOSQL databases
COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2014 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations
More informationHighQSoft GmbH Big Data ODS. Setting up of a prototype
Big Data ODS Setting up of a prototype 1 Performance und Scalability Topics 1. Why Big Data? 2. General Overview 3. HighQSoft Approach 4. Summary 2 What is the ODS 6.0 Proposal? Overview ODS API Definition
More informationArchitectural challenges for building a low latency, scalable multi-tenant data warehouse
Architectural challenges for building a low latency, scalable multi-tenant data warehouse Mataprasad Agrawal Solutions Architect, Services CTO 2017 Persistent Systems Ltd. All rights reserved. Our analytics
More informationIntroduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data
Introduction to Hadoop High Availability Scaling Advantages and Challenges Introduction to Big Data What is Big data Big Data opportunities Big Data Challenges Characteristics of Big data Introduction
More information