Graph and Timeseries Databases
- Lucas Harry Elliott
- 5 years ago
Transcription
1 Graph and Timeseries Databases Roman Kern ISDS, TU Graz Roman Kern (ISDS, TU Graz) Dbase / 31
2 Graph Databases: Motivation and Basics of Graph Databases
3 Graph Databases: Introduction
What is a graph database?
- Data storage optimised for graph data structures, i.e. efficient storage and access
- Scales gracefully with the amount of data
- Additional index for look-ups
4 Graph Databases: Introduction
Why should graph databases work?
- Networks usually have certain properties
  - Small-world phenomenon: even in big networks, only a few hops are required on average to reach even distant nodes
- Access to data follows certain patterns
  - Locality of reference: operations are focused on certain areas
5 Graph Databases: Introduction
Why should one not use a graph database?
- If all the data is updated at once, e.g. an operation applied to all nodes
- If the query cannot easily be expressed as a graph traversal operation, e.g. a lot of random access, or aggregate functions
6 Graph Databases: Introduction
Graph database vs. relational database
- In principle, a graph database can be implemented via a relational database, using joins or consecutive queries
- But relational databases are not optimised for such graph models, i.e. a lot of sparse (semi-empty) rows
- Additionally, relational databases are not designed for changes in the schema, e.g. dynamic types of relations
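The point about joins can be made concrete with a minimal sketch, assuming a hypothetical two-table layout (`node`, `edge`) in an in-memory SQLite database. The table names, column names and sample data are illustrative and not taken from the slides. Each additional hop in the traversal costs one more self-join on the edge table.

```python
import sqlite3

# Hypothetical schema: one table for nodes, one for directed, labelled edges.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edge (src INTEGER, dst INTEGER, label TEXT);
    INSERT INTO node VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cem'), (4, 'Dee');
    INSERT INTO edge VALUES (1, 2, 'knows'), (2, 3, 'knows'), (3, 4, 'knows');
""")


def two_hop_neighbours(conn, start_name):
    """Return the names reachable in exactly two hops from start_name."""
    rows = conn.execute(
        """
        SELECT n2.name
        FROM node AS n0
        JOIN edge AS e1 ON e1.src = n0.id   -- first hop
        JOIN edge AS e2 ON e2.src = e1.dst  -- second hop: another self-join
        JOIN node AS n2 ON n2.id = e2.dst
        WHERE n0.name = ?
        ORDER BY n2.name
        """,
        (start_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(two_hop_neighbours(conn, "Ann"))
```

A traversal of unknown depth would need either one query per hop (the "consecutive queries" mentioned above) or a recursive query, which is where a dedicated graph store may become more convenient.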
7 Graph Databases: Introduction
Graph database vs. relational database
Figure: Comparison of import time for 2 graph databases vs. a relational DB
8 Graph Databases: Introduction
Main concepts
- Many contemporary graph databases are based on property graphs
  - I.e. each node and edge is associated with a set of key/value pairs
  - Edges are directed (and often carry a label)
- Often support ACID properties, i.e. each modification takes place within a transaction
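As an illustration of the property graph model described above, here is a minimal in-memory sketch. The class and method names (`PropertyGraph`, `add_node`, `add_edge`, `neighbours`) are hypothetical and do not correspond to the API of any particular graph database; transactions are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)  # key/value pairs


@dataclass
class Edge:
    source: int  # edges are directed: source -> target
    target: int
    label: str
    properties: dict = field(default_factory=dict)  # key/value pairs


class PropertyGraph:
    def __init__(self):
        self.nodes = {}     # node_id -> Node
        self.outgoing = {}  # node_id -> list of outgoing Edge objects

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, dict(properties))
        self.outgoing.setdefault(node_id, [])

    def add_edge(self, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, label, dict(properties))
        self.outgoing[source].append(edge)
        return edge

    def neighbours(self, node_id, label=None):
        """Targets of outgoing edges, optionally restricted to one label."""
        return [
            e.target
            for e in self.outgoing[node_id]
            if label is None or e.label == label
        ]


graph = PropertyGraph()
graph.add_node(1, name="Ann", kind="person")
graph.add_node(2, name="Acme", kind="company")
graph.add_edge(1, 2, "works_for", since=2015)
print(graph.neighbours(1, label="works_for"))
```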
9 Graph Databases: Introduction
Query types for graph databases
- Lookup of nodes
- Traversal of a graph
  - Start at a node and continue following edges until a stopping criterion has been reached
  - Breadth-first vs. depth-first
- Path finding: find a path between two nodes (e.g. Dijkstra, A*)
- Path matching: matching patterns in the graph
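The traversal and path-finding query types can be sketched on a plain adjacency-list dictionary. This is a minimal illustration, not how a graph database implements them: the sample graph is made up, and the path search uses breadth-first search on an unweighted graph rather than Dijkstra or A*, which would be needed for weighted edges.

```python
from collections import deque


def bfs_path(graph, start, goal):
    """Breadth-first search; returns a path with the fewest hops, or None."""
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:  # stopping criterion: goal reached
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def dfs_order(graph, start):
    """Depth-first traversal; returns nodes in the order they are visited."""
    visited, stack, order = set(), [start], []
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first listed one is visited first.
        stack.extend(reversed(graph.get(node, [])))
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(bfs_path(graph, "a", "e"))
print(dfs_order(graph, "a"))
```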
10 Modelling of Graph Databases: How to represent the data and how to model
11 Modelling of Graph Databases: Approach
What should a graph database schema look like?
- I.e. how is the data represented as nodes and edges
- Many different ways of modelling
  - Graphs are a very flexible data structure, capable of capturing many domain models
- In many cases a direct mapping between domain model and graph model is possible
- Need to review the model
  - Validate that the graph is suited for the queries being used
  - E.g. don't mix entities with relations
12 Modelling of Graph Databases: How to model for graph databases
13 Application of Graph Databases: Practical aspects of graph databases
14 Application of Graph Databases
Main software tools for graph databases
- Neo4j
- OrientDB
- TitanDB
- and many others
15 Application of Graph Databases: Panama Papers - Intro
- Leaked documents from a firm in Panama (2.6 TB of data) about offshore activities
- Journalists (around the world) were working on analysing the data
- A graph database was the back-end of these activities
  - Neo4j (plus Apache Solr and Tika)
16 Application of Graph Databases: Panama Papers - Steps
- Populate the database
  - Analyse the documents, e.g. entity extraction (detect names), which yields the entities in the graph
    - Entity types: company, officer, client, address
  - Extract meta-data of documents, which yields the properties for nodes
  - Detect relationships, e.g. using the connection of sender/receiver of e-mails
- Refinement of the graph
  - Manual work conducted by the journalists
  - More entity types, e.g. money flow, document types
17 Application of Graph Databases
18 Time Series Databases: Motivation and Basics of Time Series Databases
19 Time Series Databases: Introduction
What is a time series database?
- Data storage optimised for temporal data
  - Endless stream of incoming data
- Often accompanied by
  - Tools to acquire time series data
  - Tools to visualise and analyse such data
20 Time Series Databases: Introduction
Typical characteristics of time series databases
- Fast write/append operations
- Slow update/delete operations
- Scales well to huge amounts of data
- Retention policy, i.e. to forget old data
- Access restrictions
- Provide (SQL-like) query languages
  - Optimised for time range queries
  - Specialised queries for aggregates
- Often rely on other storage mechanisms, e.g. key-value store, wide-column storage
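Several of these characteristics (cheap appends, time range queries, aggregates, retention) can be illustrated with a small in-memory sketch. The `TimeSeries` class and its method names are hypothetical; the sketch assumes numeric timestamps that arrive in non-decreasing order, and real systems persist and compress the data rather than keeping Python lists.

```python
from bisect import bisect_left, bisect_right


class TimeSeries:
    def __init__(self):
        self.timestamps = []  # kept sorted because writes are append-only
        self.values = []

    def append(self, timestamp, value):
        """Fast path: new points always go to the end."""
        if self.timestamps and timestamp < self.timestamps[-1]:
            raise ValueError("out-of-order write not supported in this sketch")
        self.timestamps.append(timestamp)
        self.values.append(value)

    def range(self, start, end):
        """Time range query over [start, end] using binary search."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def mean(self, start, end):
        """A simple aggregate over a time range; None if the range is empty."""
        points = self.range(start, end)
        if not points:
            return None
        return sum(value for _, value in points) / len(points)

    def apply_retention(self, now, max_age):
        """Forget points older than now - max_age; returns how many were dropped."""
        cutoff = bisect_left(self.timestamps, now - max_age)
        del self.timestamps[:cutoff]
        del self.values[:cutoff]
        return cutoff


ts = TimeSeries()
for t, v in [(0, 1.0), (10, 2.0), (20, 3.0), (30, 4.0)]:
    ts.append(t, v)
print(ts.range(10, 20))
print(ts.mean(10, 30))
print(ts.apply_retention(now=30, max_age=15))
```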
21 Time Series Databases: Introduction
Sources of time series
- Observations, e.g. weather data, CO2
- Stock exchange data
- Sensor data
- Log files
22 Time Series Databases: Introduction
Types of time series databases
- Limited type of payload
  - E.g. limited to just timestamp + number
  - Least amount of memory needed
- Flexible payload
  - Allows for richer representation
  - E.g. timestamp + document
- Wide tables
  - Each row consists of many columns, often hundreds of columns
  - Sparse rows
23 Modelling of Time Series Databases: How to represent the data and how to model
24 Modelling of Time Series Databases: Approach
- Often based on single samples (observations)
- Single vs. multiple
  - E.g. sensor readouts of multiple sensors (temperature, air pressure)
- Example: a measurement consists of timestamp, metric name, value, list of filters
  - E.g. 10:32, cpu-usage, 0.87, host=example.com, cpu=01
  - Flat file
- Generic vs. specific
  - Store the name of the time series with each observation (generic)
    - Needed in case of dynamic systems, e.g. different sensors become available or disappear
  - Have dedicated time series (specific)
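The generic model (the series name and filters travel with every observation) can be sketched as follows. The names `Observation`, `make_observation` and `group_by_series` are hypothetical, and the timestamps are kept as plain strings to mirror the example on the slide. Grouping by metric name and filters shows how dedicated (specific) series could be derived from the generic records.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    timestamp: str
    metric: str     # name of the time series, stored with each observation
    value: float
    filters: tuple  # sorted (key, value) pairs, e.g. host and cpu


def make_observation(timestamp, metric, value, **filters):
    return Observation(timestamp, metric, value, tuple(sorted(filters.items())))


def group_by_series(observations):
    """Split generic observations into one list per (metric, filters) series."""
    series = defaultdict(list)
    for obs in observations:
        series[(obs.metric, obs.filters)].append((obs.timestamp, obs.value))
    return dict(series)


observations = [
    make_observation("10:32", "cpu-usage", 0.87, host="example.com", cpu="01"),
    make_observation("10:33", "cpu-usage", 0.91, host="example.com", cpu="01"),
    make_observation("10:32", "cpu-usage", 0.12, host="example.com", cpu="02"),
]
for key, points in group_by_series(observations).items():
    print(key, points)
```

Because new filter combinations simply create new keys, sensors that appear or disappear need no schema change in this representation.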
25 Modelling of Time Series Databases: Approach
- Windowed storage
  - Each row represents a time window
  - Columns for a more fine-grained resolution
  - Typically between 100 and 1000 observations per row
  - Alternatively, multiple observations are stored in a single column, using a custom (compressed) format
- Special case: temporal and spatial data
  - Requires specialised look-up methods
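Windowed storage can be sketched by bucketing observations into rows keyed by the start of their time window, with the offset inside the window acting as the column. The function name `to_windowed_rows`, the one-hour window and the use of integer second timestamps are assumptions made for this illustration.

```python
from collections import defaultdict


def to_windowed_rows(observations, window_seconds=3600):
    """Group (timestamp, value) pairs into one row per time window.

    Row key: start of the window. Column key: offset in seconds within it.
    """
    rows = defaultdict(dict)
    for timestamp, value in observations:
        window_start = timestamp - timestamp % window_seconds
        rows[window_start][timestamp - window_start] = value
    return dict(rows)


observations = [(0, 1.0), (60, 1.5), (3600, 2.0), (3725, 2.5)]
for window_start, columns in to_windowed_rows(observations).items():
    print(window_start, columns)
```

A range query then only has to fetch the rows whose window overlaps the requested interval, instead of one row per observation.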
26 Time Series Databases: Example
Practical aspects of time series databases
27 Time Series Databases: Example
TICK Stack
Collection of tools:
- Telegraf: server agent for collecting and reporting metrics (stream or batch processing), to write data into the DB
- InfluxDB: the time series database component
- Chronograf: graphing and visualisation front-end for exploration
- Kapacitor: data processing engine, can process stream and batch data
https://www.influxdata.com/wp-content/themes/influx/images/tick-stack.png
28 Time Series Databases: Example
InfluxDB features
- Tags
  - Tags are indexed
  - Store commonly queried meta-data
  - Use tags if GROUP BY should be used on the data
- Fields
  - Fields are not indexed
  - Use fields for everything that should not be stored as a string
  - Use fields if aggregation functions should be applied to the data (COUNT, MAX, PERCENTILE, CUMSUM)
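The practical consequence of "tags are indexed, fields are not" can be illustrated with a conceptual sketch. This is not InfluxDB's storage engine or API; `TaggedStore` and its methods are hypothetical. Tag values go into an index so points can be selected without scanning, while field values are only read once the matching points have been found.

```python
from collections import defaultdict


class TaggedStore:
    def __init__(self):
        self.points = []                   # list of (timestamp, tags, fields)
        self.tag_index = defaultdict(set)  # (tag key, tag value) -> positions

    def write(self, timestamp, tags, fields):
        position = len(self.points)
        self.points.append((timestamp, tags, fields))
        for key, value in tags.items():    # only tags are indexed
            self.tag_index[(key, value)].add(position)

    def select_by_tag(self, key, value):
        """Index look-up: touches only the matching points."""
        positions = sorted(self.tag_index.get((key, value), ()))
        return [self.points[i] for i in positions]

    def max_field(self, field_name, tag_key, tag_value):
        """Aggregate a numeric field over the points selected by a tag."""
        values = [
            fields[field_name]
            for _, _, fields in self.select_by_tag(tag_key, tag_value)
            if field_name in fields
        ]
        return max(values) if values else None


store = TaggedStore()
store.write(1, {"host": "a"}, {"cpu": 0.2})
store.write(2, {"host": "b"}, {"cpu": 0.9})
store.write(3, {"host": "a"}, {"cpu": 0.5})
print(store.max_field("cpu", "host", "a"))
```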
29 Time Series Databases: Example
Figure: Screenshot of example data stored in InfluxDB
30 The End
Next: Map/Reduce
More informationAbstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight
ESG Lab Review InterSystems Data Platform: A Unified, Efficient Data Platform for Fast Business Insight Date: April 218 Author: Kerry Dolan, Senior IT Validation Analyst Abstract Enterprise Strategy Group
More informationModern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics
More informationBig Data Management and NoSQL Databases
NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic
More informationYandex.Classifieds. Vadim Tsesko
YoctoDB @ Yandex.Classifieds Vadim Tsesko incubos@ About Backend infrastructure team Services Libraries Frameworks @ Yandex.Classifieds auto.ru auto.yandex.ru rabota.yandex.ru realty.yandex.ru travel.yandex.ru
More informationThe Future of the Realtime Web BETTER APIS WITH GRAPHQL. Josh
The Future of the Realtime Web BETTER APIS WITH GRAPHQL Josh Price @joshprice STEPPING STONES TO FP Language (Elixir) Strongly-Typed APIs (GraphQL) GRAPHQL WAS HERE? http://whiteafrican.com/2008/05/12/crossing-the-mapping-chasm/
More informationDEC Computer Technology LESSON 6: DATABASES AND WEB SEARCH ENGINES
DEC. 1-5 Computer Technology LESSON 6: DATABASES AND WEB SEARCH ENGINES Monday Overview of Databases A web search engine is a large database containing information about Web pages that have been registered
More informationSEEM4540 Open Systems for E-Commerce Lecture 04 Servers Setup and Content Management Systems
SEEM4540 Open Systems for E-Commerce Lecture 04 Servers Setup and Content Management Systems Prolog To show our e-commerce store, we need to have a web server. There are three ways to obtain a web server:
More informationBig data for big river science: data intensive tools, techniques, and projects at the USGS/Columbia Environmental Research Center
Big data for big river science: data intensive tools, techniques, and projects at the USGS/Columbia Environmental Research Center Ed Bulliner U.S. Geological Survey, Columbia Environmental Research Center
More informationChapter 24 NOSQL Databases and Big Data Storage Systems
Chapter 24 NOSQL Databases and Big Data Storage Systems - Large amounts of data such as social media, Web links, user profiles, marketing and sales, posts and tweets, road maps, spatial data, email - NOSQL
More informationAn Introduction to Big Data Formats
Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION
More information