real-time delivery architecture
- Milo Malone
- 5 years ago
Transcription
1 real-time delivery uc berkeley - 27 august 2012
2 designing twitter
3 what are the goals? evolve from being solely a web stack
4 ROUTING PRESENTATION LOGIC STORAGE & RETRIEVAL T-Bird T-Flock + Haplo Monorail Darkwing Flock(s)
5 what are the goals? evolve from being solely a web stack; isolate responsibilities and concerns; site speed and reliability; developer innovation speed
6
7 Pull / Targeted: twitter.com, home_timeline API. Pull / Queried: Search API. Push / Targeted: User / Site Streams, Mobile Push (SMS, etc.). Push / Queried: Track / Follow Streams.
8
9
10
11
12 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Cache Fanout
13 [fanout diagram] annotations: pipelined, 4k destinations at a time; replicated; keyed off recipient; insert. components: Write API, Ingester, Fanout, Social Graph Service, Timeline Cache, Timeline Service, Search Cache, Blender, HTTP Push, Mobile Push, Push Compute, Hadoop, Batch Compute
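The "4k destinations at a time" note on this slide can be illustrated with a small batching sketch. This is only an assumption about what the pipelining looks like: the function names `batches` and `fanout`, and the `insert` callback standing in for a write to the timeline cache, are hypothetical and do not come from the talk.

```python
from typing import Callable, Iterable, Iterator, List

BATCH_SIZE = 4000  # "4k destinations at a time", taken from the slide


def batches(follower_ids: Iterable[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield follower IDs in lists of at most `size` elements."""
    batch: List[int] = []
    for follower_id in follower_ids:
        batch.append(follower_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, partially filled batch
        yield batch


def fanout(
    tweet_id: int,
    author_id: int,
    follower_ids: Iterable[int],
    insert: Callable[[int, int, List[int]], None],
) -> int:
    """Hand one tweet to `insert` once per batch of recipients.

    `insert` is a hypothetical stand-in for the timeline-cache write.
    Returns the number of batches issued.
    """
    issued = 0
    for batch in batches(follower_ids):
        insert(tweet_id, author_id, batch)
        issued += 1
    return issued
```

With 10,000 followers this issues three calls, two of 4,000 recipients and one of 2,000, so the amount of work per call stays bounded no matter how large the follower list is.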
14 [timeline cache diagram] annotations: using redis; native list structure; RPUSHX to only add to cached timelines; entry layout: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes). components: Write API, Ingester, Fanout, Timeline Cache, Timeline Service, Search Cache, Blender, HTTP Push, Mobile Push, Push Compute, Hadoop, Batch Compute
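A rough sketch of the entry layout and the RPUSHX behaviour named on this slide. The 8/8/4-byte sizes come from the slide; the big-endian unsigned encoding, the `timeline:<id>` key names, and the `FakeListStore` class (an in-memory stand-in so the example runs without a Redis server) are assumptions made for illustration. The `rpushx` method mirrors the documented Redis semantics: append only if the list already exists, otherwise do nothing and return 0.

```python
import struct
from typing import Dict, List

# Sizes from the slide: 8-byte tweet ID, 8-byte user ID, 4 bytes of bits.
# Byte order and signedness are assumptions.
ENTRY = struct.Struct(">QQI")


def pack_entry(tweet_id: int, user_id: int, bits: int = 0) -> bytes:
    """Encode one timeline entry as 20 bytes."""
    return ENTRY.pack(tweet_id, user_id, bits)


class FakeListStore:
    """In-memory stand-in for the two Redis list commands used here."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[bytes]] = {}

    def rpush(self, key: str, value: bytes) -> int:
        """Append, creating the list if needed (like Redis RPUSH)."""
        self._lists.setdefault(key, []).append(value)
        return len(self._lists[key])

    def rpushx(self, key: str, value: bytes) -> int:
        """Append only if the list exists (like Redis RPUSHX)."""
        if key not in self._lists:
            return 0  # timeline not cached: skip the write
        self._lists[key].append(value)
        return len(self._lists[key])

    def entries(self, key: str) -> List[bytes]:
        return list(self._lists.get(key, []))
```

The point of the "only if exists" write is that a timeline which is not currently cached never gets a partial list created by a stray delivery. Presumably it is rebuilt in full when it is next read, although the slide does not say so.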
15 [timeline cache diagram, as slide 14] a cached timeline drawn as a list of entries (Tweet ID, User ID, Bits); annotations: using redis; native list structure; RPUSHX to only add to cached timelines
16 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Cache Fanout
17 Write API Fanout Timeline Cache Gizmoduck Timeline Service TweetyPie
18 Pull / Targeted: twitter.com, home_timeline API. Pull / Queried: Search API. Push / Targeted: User / Site Streams, Mobile Push (SMS, etc.). Push / Queried: Track / Follow Streams.
19 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Cache Fanout
20 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Cache Fanout
21 [search diagram] blender: queries one replica of all indexes; merges & ranks results. components: Write API, Ingester, Search Index, Blender, Fanout, Timeline Cache, Timeline Service, HTTP Push, Mobile Push, Push Compute, Hadoop, Batch Compute
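The blender notes on this slide (query one replica of every index, then merge and rank) describe a scatter-gather pattern, sketched below. Everything concrete here is an assumption: a replica is modelled as a plain callable returning `(score, tweet_id)` pairs, the replica is picked at random, and ranking is simply by score. The real blender's scoring and replica selection are not described in the slides.

```python
import heapq
import random
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, int]                 # (score, tweet_id), hypothetical shape
Replica = Callable[[str], List[Hit]]    # one searchable copy of an index partition


def blend(
    query: str,
    partitions: Sequence[Sequence[Replica]],
    limit: int = 10,
    choose: Callable[[Sequence[Replica]], Replica] = random.choice,
) -> List[Hit]:
    """Query one replica of every partition, then merge and rank the hits."""
    hits: List[Hit] = []
    for replicas in partitions:
        replica = choose(replicas)      # one replica per index partition
        hits.extend(replica(query))     # scatter: each partition answers alone
    # gather: keep the `limit` best hits across all partitions
    return heapq.nlargest(limit, hits)
```

Because every partition must be consulted, read cost grows with the number of partitions, while a write only touches the one partition that stores the tweet.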
22 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
23 Pull / Targeted: twitter.com, home_timeline API. Pull / Queried: Search API. Push / Targeted: User / Site Streams, Mobile Push (SMS, etc.). Push / Queried: Track / Follow Streams.
24 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
25 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
26 http push / hosebird: maintains persistent connections with end clients; processes tweet & social graph events; event-based router
27 [diagram: Write API, Hosebird Firehose, Hosebird User Streams, Hosebird Track / Follow] event propagation: write API sends all events into hosebird; sees content creation events, social graph changes, etc.; different queues for public tweets, protected tweets, social events, etc.
28 [diagram: Write API, Hosebird Firehose, Hosebird User Streams, Hosebird Track / Follow] event cascading; bandwidth management; simultaneous connection management (~1m long-lived, open connections to this cluster)
29 [diagram: Write API, Hosebird Firehose, Hosebird User Streams, Hosebird Track / Follow] firehose: edge machine simply outputs the public tweet queue; only allow a limited number of firehoses per hosebird box, for bandwidth management
30 [diagram: Write API, Hosebird Firehose, Hosebird Track / Follow, Hosebird User Streams] track / follow: simple query based on tweet content; keeps list of terms / users of interest; parses public tweets at the edge, and if a term matches a token, or the user is of interest, then route
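The track / follow rule on this slide (route when a tracked term matches a token, or when the author is a user of interest) could look roughly like the sketch below. The `Subscription` class, the whitespace-and-punctuation tokenizer, and the case-insensitive comparison are all assumptions. The slide does not say how tweets are actually tokenized.

```python
from dataclasses import dataclass, field
from typing import Set

PUNCTUATION = ".,!?:;\"'()"


@dataclass
class Subscription:
    """Per-connection list of terms and users of interest (hypothetical shape)."""
    terms: Set[str] = field(default_factory=set)
    user_ids: Set[int] = field(default_factory=set)


def tokenize(text: str) -> Set[str]:
    """Split on whitespace, trim surrounding punctuation, lowercase."""
    tokens = {word.strip(PUNCTUATION).lower() for word in text.split()}
    tokens.discard("")
    return tokens


def should_route(sub: Subscription, author_id: int, text: str) -> bool:
    """True if the tweet matches a followed user or a tracked term."""
    if author_id in sub.user_ids:
        return True
    tracked = {term.lower() for term in sub.terms}
    return not tracked.isdisjoint(tokenize(text))
```

For example, a subscription tracking `hello` would receive a public tweet reading "Hello, world" from any author, and a subscription following user 42 would receive every public tweet from that user regardless of its text.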
31 [diagram: Write API, Hosebird Firehose, Hosebird Track / Follow, Hosebird User Streams] user streams: replicate home timeline experience; upon login, obtain following list; keep cached following list coherent by seeing social graph updates; route tweet if from a followed user
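A minimal sketch of the user-streams behaviour listed on this slide: load the following list at login, keep it coherent from social graph events, and route a tweet only if its author is followed. The class and method names are hypothetical, and the follow / unfollow events are modelled as simple `(source, target)` calls, which is an assumption about the event shape.

```python
from typing import Iterable, Set


class UserStreamConnection:
    """State held for one connected user (illustrative only)."""

    def __init__(self, user_id: int, following: Iterable[int]) -> None:
        self.user_id = user_id
        # "upon login, obtain following list"
        self.following: Set[int] = set(following)

    def on_follow(self, source_id: int, target_id: int) -> None:
        """Social graph update: source started following target."""
        if source_id == self.user_id:
            self.following.add(target_id)

    def on_unfollow(self, source_id: int, target_id: int) -> None:
        """Social graph update: source stopped following target."""
        if source_id == self.user_id:
            self.following.discard(target_id)

    def should_route(self, author_id: int) -> bool:
        """Route the tweet if it comes from a followed user."""
        return author_id in self.following
```

Keeping the set up to date from the event stream means the routing check never has to go back to the social graph service while the connection is open.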
32 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
33 Pull / Targeted: twitter.com, home_timeline API. Pull / Queried: Search API. Push / Targeted: User / Site Streams, Mobile Push (SMS, etc.). Push / Queried: Track / Follow Streams.
34 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
35 Write API Ingester Blender Timeline Service Mobile Push Hadoop Social Graph Service Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
36 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
37 Pull / Targeted: twitter.com, home_timeline API. Pull / Queried: Search API. Push / Targeted: User / Site Streams, Mobile Push (SMS, etc.). Push / Queried: Track / Follow Streams.
38 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
39 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
40 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
41 Synchronous Path Write API Ingester Blender Timeline Service Mobile Push Asynchronous Path Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout Query Path
42 Synchronous Path Write API Ingester Blender Timeline Service Mobile Push Asynchronous Path Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout Query Path
43 Synchronous Path Write API Ingester Blender Timeline Service Mobile Push Asynchronous Path Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout Query Path
44 Blender Timeline Service Write Path HTTP Push Mobile Push Hadoop Batch Compute Timeline Cache Fanout Push Compute Ingester Search Index Read Path Write API
45 Blender Timeline Service Write Path HTTP Push Mobile Push Hadoop Batch Compute Timeline Cache Fanout Push Compute Ingester Search Index Read Path Write API
46 things we're trying...
47
48 Write API Ingester Fanout Search Index Timeline Cache search index [ hello, world ] fanout index [@danadanger,...]
49 User Intent -> Query Expansion: "Hello, world" -> Hello AND [...]; [...]'s home timeline -> user_timeline:nelson OR user_timeline:danadanger
50 fan-in (Write API, Ingester, Search Index): O(1) write, O(n) read. fan-out (Write API, Fanout, Timeline Cache): O(n) write, O(1) read.
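The cost trade-off on this slide can be made concrete with two toy timeline stores. This is an illustrative sketch, not the production design: both classes and their dictionary-based storage are assumptions, and tweet IDs are assumed to increase over time so that sorting by ID gives chronological order.

```python
from collections import defaultdict
from typing import Dict, List, Set


class FanOutOnWrite:
    """O(n) write, O(1) read: copy each tweet into every follower's timeline."""

    def __init__(self, followers: Dict[str, Set[str]]) -> None:
        self.followers = followers
        self.home: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        for follower in self.followers.get(author, set()):  # n = follower count
            self.home[follower].append(tweet_id)

    def read(self, user: str) -> List[int]:
        return list(self.home.get(user, []))  # single precomputed list


class FanInOnRead:
    """O(1) write, O(n) read: store once, merge authors' tweets at read time."""

    def __init__(self, following: Dict[str, Set[str]]) -> None:
        self.following = following
        self.user_timeline: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        self.user_timeline[author].append(tweet_id)  # one append

    def read(self, user: str) -> List[int]:
        merged: List[int] = []
        for author in self.following.get(user, set()):  # n = accounts followed
            merged.extend(self.user_timeline.get(author, []))
        return sorted(merged)
```

Both stores return the same timeline for the same writes. They differ only in where the per-relationship work is paid, which is why the slide pairs fan-out with the timeline cache and fan-in with the search index.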
51 User Intent -> Query Expansion: "Hello, world" -> Hello AND [...]; [...]'s home timeline -> home_timeline:raffi
52 User Intent -> Query Expansion: "Hello, world" -> Hello AND [...]; [...]'s home timeline -> home_timeline:raffi OR user_timeline:taylorswift13
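One possible reading of this expansion is a hybrid: serve the precomputed home timeline, and merge in the user timelines of very widely followed accounts at read time instead of fanning their tweets out. The sketch below builds the query string under that reading. The function name, the follower-count threshold, and the rule for choosing which accounts are merged at read time are all assumptions. The slide shows only the resulting query.

```python
from typing import Dict, Iterable


def expand_home_timeline(
    user: str,
    following: Iterable[str],
    follower_counts: Dict[str, int],
    threshold: int,
) -> str:
    """Build a query of the form shown on the slide.

    Accounts with at least `threshold` followers (a hypothetical cut-off)
    are queried via user_timeline: instead of being fanned out.
    """
    terms = [f"home_timeline:{user}"]
    for account in sorted(following):
        if follower_counts.get(account, 0) >= threshold:
            terms.append(f"user_timeline:{account}")
    return " OR ".join(terms)
```

Under this reading, the O(n) write cost of fan-out is skipped exactly for the accounts where n is largest, at the price of a slightly more expensive read for their followers.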
53 streaming compute: continuous computation driven by the events that come into twitter; generalizing the push mechanism
54 Write API Ingester Blender Timeline Service Mobile Push Hadoop Batch Compute HTTP Push Push Compute Timeline Cache Search Index Fanout
55 timeline query statistics: >150m active users worldwide; 300k qps poll-based; 1ms p50 / 4ms p99; 30k qps search-based timelines
56 tweet input: ~340m tweets per day; ~4K/sec daily average; ~6K/sec daily peak; >10K/sec during large events
57
58 followed by following
59 timeline delivery statistics: 26b deliveries / day (~18m / min); 3.5 seconds p50 to deliver to 1m; ~300k deliveries / sec
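The rounded rates on slides 56 and 59 can be checked against the daily totals with a few lines of arithmetic. The inputs are the slide figures. The deliveries-per-tweet ratio at the end is a derived number that does not appear in the talk, and it is only a rough average.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

tweets_per_day = 340e6        # "~340m tweets per day"
deliveries_per_day = 26e9     # "26b deliveries / day"

avg_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY          # slide: ~4K/sec
deliveries_per_min = deliveries_per_day / MINUTES_PER_DAY      # slide: ~18m/min
deliveries_per_sec = deliveries_per_day / SECONDS_PER_DAY      # slide: ~300k/sec

# Derived, not from the slides: average timeline insertions per tweet.
deliveries_per_tweet = deliveries_per_day / tweets_per_day

print(f"tweets/sec (avg):     {avg_tweets_per_sec:,.0f}")
print(f"deliveries/min:       {deliveries_per_min:,.0f}")
print(f"deliveries/sec:       {deliveries_per_sec:,.0f}")
print(f"deliveries per tweet: {deliveries_per_tweet:.1f}")
```

The computed values land within rounding distance of the slide figures, and the ratio suggests that an average tweet is written to somewhere under a hundred timelines.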
60 thanks!
More informationHow can you implement this through a script that a scheduling daemon runs daily on the application servers?
You ve been tasked with implementing an automated data backup solution for your application servers that run on Amazon EC2 with Amazon EBS volumes. You want to use a distributed data store for your backups
More informationResearch challenges in data-intensive computing The Stratosphere Project Apache Flink
Research challenges in data-intensive computing The Stratosphere Project Apache Flink Seif Haridi KTH/SICS haridi@kth.se e2e-clouds.org Presented by: Seif Haridi May 2014 Research Areas Data-intensive
More informationITP 342 Mobile App Development. APIs
ITP 342 Mobile App Development APIs API Application Programming Interface (API) A specification intended to be used as an interface by software components to communicate with each other An API is usually
More informationCommunication. Overview
Communication Chapter 2 1 Overview Layered protocols Remote procedure call Remote object invocation Message-oriented communication Stream-oriented communication 2 Layered protocols Low-level layers Transport
More informationDistributed Systems COMP 212. Lecture 15 Othon Michail
Distributed Systems COMP 212 Lecture 15 Othon Michail RPC/RMI vs Messaging RPC/RMI great in hiding communication in DSs But in some cases they are inappropriate What happens if we cannot assume that the
More informationCS60021: Scalable Data Mining. Sourangshu Bhattacharya
CS60021: Scalable Data Mining Sourangshu Bhattacharya In this Lecture: Outline: HDFS Motivation HDFS User commands HDFS System architecture HDFS Implementation details Sourangshu Bhattacharya Computer
More information! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like
Cloud background Google File System! Warehouse scale systems " 10K-100K nodes " 50MW (1 MW = 1,000 houses) " Power efficient! Located near cheap power! Passive cooling! Power Usage Effectiveness = Total
More informationSearch Engines and Time Series Databases
Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Search Engines and Time Series Databases Corso di Sistemi e Architetture per Big Data A.A. 2017/18
More information