SQL++ Query Language 9/9/2015. Semistructured Databases have Arrived. in the form of JSON. and its use in the FORWARD framework
|
|
- Lora Jenkins
- 6 years ago
- Views:
Transcription
1 Query Language and its use in the FORWARD framework with the support of National Science Foundation, Google, Informatica, Couchbase Semistructured Databases have Arrived SQL-on-Hadoop NoSQL PIG CQL SQL-on-Hadoop Jaql Others N1QL AQL in the form of JSON location: 'Alpine', readings: time: timestamp(' t20:00:00'), ozone: 0.035, no2: , time: timestamp(' t22:00:00'), ozone: 'm', co: 0.4 1
2 Semistructured Query Languages come along SQL: Expressive Power and Adoption JSON: Semistructured Data but in a Tower of Babel SELECT AVG(temp) AS tavg FROM readings GROUP BY sid SQL db.readings.aggregate( $group: _id: "$sid", tavg: $avg:"$temp") MongoDB readings -> group by sid = $.sid into tavg: avg($.temp) ; Jaql for $r in collection("readings") group by $r.sid return tavg: avg($r.temp) JSONiq a = LOAD 'readings' AS (sid:int, temp:float); b = GROUP a BY sid; c = FOREACH b GENERATE AVG(temp); DUMP c; Pig with limited query abilities Growing out of key-value databases or SQL with special type UDFs Far from the full-fledged power of prior non-relational query languages OQL, Xquery *academic projects being the exception (eg, ASTERIX, FORWARD) 2
3 in lieu of formal semantics Query languages were not meant for Ninjas Outline What is? How we use in UCSD s FORWARD? Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop : Data model + query language for querying both structured data (SQL) and semistructured data (native JSON) Formal Schemaless Semantics: adjusting tuple calculus to the richness of JSON Configurable : Formal configuration options towards choosing semantics, understanding repercussions Full fledged, declarative, aggressively backwardscompatible with SQL Unifying: 11 popular databases/qls captured by appropriate settings of Configurable 3
4 How is it used in the Big Data industry ASTERIXDB CouchDB heading towards How do we use it in UCSD Automatic Incremental Maintenance Middleware Query Optimization & Execution: FORWARD virtual database query plans based on algebra based on a schemaless algebra Data Model (also JSON++ data model) A superset of SQL and JSON Extend SQL with arrays + nesting + heterogeneity location: 'Alpine', readings: time: timestamp(' t20:00:00'), ozone: 0.035, no2: , time: timestamp(' t22:00:00'), ozone: 'm', co: 0.4 Array nested inside a tuple Heterogeneous tuples in collections Arbitrary compositions of array, bag, tuple 4
5 Data Model (also JSON++ data model) A superset of SQL and JSON Extend SQL with arrays + nesting + heterogeneity + permissive structures 5, 10, 3, 21, 2, 15, 6, 4, lo: 3, exp: 4, hi: 7, 2, 13, 6 Arbitrary compositions of array, bag, tuple Data Model (also JSON++ data model) A superset of SQL and JSON Extend SQL with arrays + nesting + heterogeneity + permissive structures Extend JSON with bags and enriched types location: 'Alpine', readings: time: timestamp(' t20:00:00'), ozone: 0.035, no2: , time: timestamp(' t22:00:00'), ozone: 'm', co: 0.4 Bags Enriched types Exercise: Why the data model is a superset of the SQL data model? a SQL table corresponds to a JSON bag that has homogeneous tuples 5
6 Data Model (also JSON++ data model) A superset of SQL and JSON Extend SQL with arrays + nesting + heterogeneity + permissive structures + permissive schema Extend JSON with bags and enriched types location: 'Alpine', readings: time: timestamp(' t20:00:00'), time: timestamp, ozone: 0.035, ozone: any, no2: *, time: timestamp(' t22:00:00'), ozone: 'm', co: 0.4 location: string, readings: With schema Or, schemaless Or, partial schema From SQL to Goal: Queries that input/output any JSON type, yet backwards compatible with SQL Backwards Compatibility with SQL readings : sid: 2, temp: 70.1, sid: 2, temp: 49.2, sid: 1, temp: null SELECT DISTINCT r.sid FROM readings AS r WHERE r.temp < 50 sid : 2 sid temp null Find sensors that recorded a temperature below 50 sid 2 6
7 From SQL to Goal: Queries that input/output any JSON type, yet backwards compatible with SQL Adjust SQL syntax/semantics for heterogeneity, complex input/output values Achieved with few extensions and mostly by SQL limitation removals From Tuple Variables to Element Variables readings : 1.3, 0.7, 0.3, 0.8 SELECT r AS co FROM readings AS r WHERE r < 1.0 ORDER BY r DESC LIMIT 2 Find the highest two sensor readings that are below 1.0 co: 0.8, co: 0.7 Element Variables readings : 1.3, 0.7, 0.3, 0.8 FROM readings AS r B out FROM = B in WHERE = r : 1.3, r : 0.7, r : 0.3, r : 0.8 WHERE r < 1.0 B out WHERE = B in ORDERBY = r : 0.7, r : 0.3, r : 0.8 ORDER BY r DESC LIMIT 2 B out ORDERBY = B in LIMIT = r : 0.8, r : 0.7, r : 0.3 B out LIMIT = B in SELECT = r : 0.8 r : 0.7 SELECT r AS co co: 0.8, co: 0.7 7
8 Correlated Subqueries Everywhere sensors : 1.3, 2 0.7, 0.7, , 0.8, , 1.4 SELECT r AS co FROM sensors AS s, s AS r WHERE r < 1.0 ORDER BY r DESC LIMIT 2 Find the highest two sensor readings that are below 1.0 co: 0.9, co: 0.8 Correlated Subqueries Everywhere FROM sensors AS s, s AS r sensors : 1.3, 2 0.7, 0.7, , 0.8, , 1.4 B out FROM = B in WHERE = s : 1.3, 2, r : 1.3, s : 1.3, 2, r : 2, s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 0.7, 0.9, r : 0.9, s : 0.3, 0.8, 1.1, r : 0.3, s : 0.3, 0.8, 1.1, r : 0.8, s : 0.3, 0.8, 1.1, r : 1.1, s : 0.7, 1.4, r : 0.7, s : 0.7, 1.4, r : 1.4 WHERE r < 1.0 ORDER BY r DESC LIMIT 2 SELECT r AS co Correlated Subqueries Everywhere sensors : 1.3, 2 0.7, 0.7, , 0.8, , 1.4 FROM sensors AS s, s AS r WHERE r < 1.0 B out WHERE = B in ORDERBY = s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 0.7, 0.9, r : 0.9, s : 0.3, 0.8, 1.1, r : 0.3, s : 0.3, 0.8, 1.1, r : 0.8, s : 0.7, 1.4, r : 0.7 ORDER BY r DESC LIMIT 2 SELECT r AS co 8
9 Correlated Subqueries Everywhere sensors : 1.3, 2 0.7, 0.7, , 0.8, , 1.4 FROM sensors AS s, s AS r WHERE r < 1.0 ORDER BY r DESC LIMIT 2 B out ORDERBY = B in LIMIT = s : 0.7, 0.7, 0.9, r : 0.9, s : 0.3, 0.8, 1.1, r : 0.8, s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 0.7, 0.9, r : 0.7, s : 0.7, 1.4, r : 0.7, s : 0.3, 0.8, 1.1, r : 0.3 B out LIMIT = B in SELECT = s : 0.7, 0.7, 0.9, r : 0.9, s : 0.3, 0.8, 1.1, r : 0.8 SELECT r AS co co: 0.9, co: 0.8 Utilization of Input Order y x sensors : 1.3, 2 0.7, 0.7, , 0.8, , 1.4 SELECT r AS co, rp AS x, sp AS y FROM sensors AS s AT sp, s AS r AT rp WHERE r < 1.0 ORDER BY r DESC LIMIT 2 Find the highest two sensor readings that are below 1.0 co: 0.9, x: 3, y: 2, co: 0.8, x: 2, y: 3 Constructing Nested Results Subqueries Again Γ 0 = sensors : sensor: 1, sensor: 2, logs: sensor: 1, co: 0.4, sensor: 1, co: 0.2, sensor: 2, co: 0.3, FROM sensors AS s B out FROM = B in SELECT = s : sensor: 1, s : sensor: 2 SELECT s.sensor AS sensor, ( SELECT l.co AS co FROM logs AS l WHERE l.sensor = s.sensor ) AS readings sensor : 1, readings: co:0.4, co:0.2, sensor : 2, readings: co:0.3 9
10 Constructing Nested Results Γ 1 = s : sensor : 1, sensors : sensor: 1, sensor: 2, logs: sensor: 1, co: 0.4, sensor: 1, co: 0.2, sensor: 2, co: 0.3, FROM logs AS l B out FROM = B in SELECT = l : sensor: 1, co: 0.4, l : sensor: 1, co: 0.2, l : sensor: 2, co: 0.3 WHERE l.sensor = s.sensor B out FROM = B in SELECT = l : sensor: 1, co: 0.4, l : sensor: 1, co: 0.2 SELECT l.co AS co co:0.4, co:0.2 Iterating over Attribute-Value pairs INNER Vs OUTER Correlation: Capturing Outerjoins, OUTER FLATTEN Γ = readings : co :, no2: 0.7, so2: 0.5, 2 FROM readings AS g:v, v AS n g : "no2", v : 0.7, n : 0.7, g : "so2", v : 0.5, 2, n : 0.5, g : "co", v : 0.5, 2, n : 2 SELECT ELEMENT gas: g, val: v gas: "co", num: null, gas: "no2", num: 0.7, gas: "so2", num: 0.5,, gas: "so2", num: 2 10
11 INNER Vs OUTER Correlation: Capturing Outerjoins, OUTER FLATTEN Γ = readings : co :, no2: 0.7, so2: 0.5, 2 FROM readings AS g:v INNER CORRELATE v AS n g : "no2", v : 0.7, n : 0.7, g : "so2", v : 0.5, 2, n : 0.5, g : "co", v : 0.5, 2, n : 2 SELECT ELEMENT gas: g, val: v gas: "co", num: null, gas: "no2", num: 0.7, gas: "so2", num: 0.5,, gas: "so2", num: 2 INNER Vs OUTER Correlation: Capturing Outerjoins, OUTER FLATTEN Γ = readings : co :, no2: 0.7, so2: 0.5, 2 FROM readings AS g:v OUTER CORRELATE v AS n g : "co", v :, n : missing, g : "no2", v : 0.7, n : 0.7, g : "so2", v : 0.5, 2, n : 0.5, g : "co", v : 0.5, 2, n : 2 SELECT ELEMENT gas: g, val: v gas: "co", num: null, gas: "no2", num: 0.7, gas: "so2", num: 0.5,, gas: "so2", num: 2 The Extensions to SQL Goal: Queries that input/output any JSON type, yet backwards compatible with SQL Adjust SQL syntax/semantics for heterogeneity, complex values Achieved with few extensions and SQL limitation removals Element Variables Correlated Subqueries (in FROM and SELECT clauses) Optional utilization of Input Order Grouping indendent of Aggregation Functions... and a few more details Configurability: making a unifying and configurable language 11
12 Algebras do not need schemaful inputs Γ 1 = s : sensor : 1, sensors : sensor: 1, sensor: sensor 2 co, logs: sensor: 1, co: 2 0.4, 0.3 sensor: 1, co: 0.2, sensor: 2, co: 0.3, FROM logs AS l B out FROM = B in SELECT = l : sensor: 1, co: 0.4, l : sensor: 1, co: 0.2, l : sensor: 2, co: 0.3 WHERE l.sensor = s.sensor B out FROM = B in SELECT = l : sensor: 1, co: 0.4, l : sensor: 1, co: 0.2 SELECT l.co AS co co:0.4, co:0.2 Outline What is? How we use in UCSD s FORWARD? uses in integration Introduction to the formal survey of NoSQL, NewSQL and SQL-on-Hadoop based Virtual Database Client Federated Queries Results FORWARD Virtual Database Query Processor Virtual s of Sources Virtual SQL Wrapper Virtual NewSQL Wrapper Native queries Virtual NoSQL Wrapper Native results Virtual SQL-on- Hadoop Wrapper Virtual Java Objects Wrapper SQL Database NewSQL Database NoSQL Database SQL-on- Hadoop Database Java In-Memory Objects 12
13 Virtual and/or Materialized Client s Queries Results FORWARD Integration Database Definitions Integrated Query Processor Integrated s Integrated Added Value Virtual Virtual Virtual Virtual Virtual SQL Database NewSQL Database NoSQL Database SQL-on- Hadoop Database Java In-Memory Objects Use Case 1: Hide the ++ source features behind SQL views BI Tool SELECT / Report Writer m.sid, t AS temp FROM measurements AS m, SQL query SQL result m.temperatures AS t FORWARD Virtual Database measurements: SQL sid temp or its equivalent measurements: sid: 2, temp: 70.1, sid: 2, temp: 70.2, sid: 1, temp: 71.0 Couchbase query Couchbase results measurements: sid: 2, temperatures: 70.1, 70.2, sid: 1, temperatures: 71.0 Couchbase Use Case 2: Semantics-Aware Pushdown "Sensors that recorded a temperature below 50" SELECT DISTINCT m.sid FROM measurements AS m WHERE m.temp < 50 semantics for m.temp < $match: temp: $lt: 50, $match: temp: $not: null... MongoDB semantics for temp $lt 50 query MongoDB query result MongoDB Virtual Database Virtual (Identical) MongoDB Wrapper MongoDB results MongoDB measurements: sid: 2, temp: 70.1, sid: 2, temp: 49.2, sid: 1, temp: null 13
14 Push-Down Challenges: Limited capabilities & semantic variations How to simulate incompatible semantics/missing features? Sources like SQL and MongoDB cannot execute the entirety of How to efficiently push down computation? Issues automatically handled by the query processor. Plenty of query rewriting problems, including novel ones on semi-structured operators captures Semantics Variations Semantics of "less-than" comparison are different across sources: < sql, < mongodb, etc. Config Parameters to capture these MongoDB ( x < y complex : deep, type_mismatch: false, null_lt_null : false, null_lt_value: false,... ( x < y ) Semantics Variations In NoSQL, NewSQL, SQL-on-Hadoop variation of semantics for: Paths Equality Comparisons And all the operators that use them: Selections Grouping Ordering Set operations Each of these features has a set of config parameters in 14
15 Don t give a standard Teach how to reach a standard Fast evolving space Standard would be premature Configuration options itemize and explain the space options Rationalize discussion Advise on consistent settings of the configuration options Not every configuration option setting is sensible Configuration options have to subscribe to common sense iff x = y then x = y if x = y and y = z then x = z which enables rewritings Have to be consistent with each other Eg, if true AND missing is missing, then false OR missing is missing FORWARD: Incremental Maintenance and Application Visualization layer The Incremental Maintenance functionality: (Materialized) Definition Eg, Couchbase has a JSON web log showing user, list of displayed products and FORWARD produces materialized view product category, count, products: product, count Stream of inserts, deletes, updates on base data Before DB Before IVM Stream Stream The Incremental Maintenance module Automatically and efficiently updates the materialized view to reflect the stream of changes can also enable automatic Incremental Maintenance! With attention to replication of data in views Opportunities by keys After DB After 15
16 FORWARD: Incremental Maintenance and Application Visualization layer Custom dashboards, interactive pages & apps The data models of visualization components (e.g. Google Maps) can be nicely captured with JSON models The pages are (JSON) views! Mashups of the components views feeds and incrementally updates the page views Use case From data to visualization with just & markup Ajax/Javascript visuals with no Ajax/Javascript mess How to easily connect to today s JS libraries Custom Ajax visualizations & interfaces for IT personnel FORWARD: Incremental Maintenance and Application Visualization layer (part of) the Google Map model <% unit google.maps.maps %> markers: position: latitude : number, longitude: number... <% end unit %> Outline What is? How we use in UCSD s FORWARD? uses in integration and (live) analytics applications Introduction to the formal survey of NoSQL, NewSQL and SQLon-Hadoop 16
17 Surveyed Databases SQL-on-Hadoop NoSQL Hive Pig Jaql SQL AQL MongoJDBC CQL MongoDB JSONiq N1QL BigQuery (IBM) (AsterixDB) (Cassandra) (Couchbase) (Google) Algebraic FLWOR SQL-like Created JSON-like Based Previously standard on syntax at XQuery "Dremel" Facebook Hadoop Also Research MongoDB Most 28msec No SQL-like production comprises used clusters cluster(s) syntax database via NoSQL JDBC release NewSQL database (UCI) driver PIG CQL SQL-on-Hadoop Jaql Others N1QL SQL & NewSQL AQL MongoDB driver Removes Superficial Differences covers SQL, N1QL and QL research prototypes (e.g., UCI s ASTERIX) Removing the current Tower of Babel effect Providing formal syntax and semantics SELECT AVG(temp) AS tavg FROM readings GROUP BY sid SQL readings -> group by sid = $.sid into tavg: avg($.temp) ; Jaql for $r in collection("readings") group by $r.sid return tavg: avg($r.temp) JSONiq db.readings.aggregate( $group: _id: "$sid", tavg: $avg:"$temp") MongoDB a = LOAD 'readings' AS (sid:int, temp:float); b = GROUP a BY sid; c = FOREACH b GENERATE AVG(temp); DUMP c; Pig Surveyed features 15 feature matrices (1-11 dimensions each) classifying: Data values Schemas Access and construct nested data Missing information Equality semantics Ordering semantics Aggregation Joins Set operators Extensibility 17
18 Outline What is? How we use in UCSD s FORWARD? Introduction to the formal survey of NoSQL, NewSQL and SQLon-Hadoop Methodology Example 1: data model (data values) Example 2: query language (SELECT clause) Example 3: semantics (path) Example 4: semantics (equality function) Methodology For each feature: 1. A formal definition of the feature in 2. A example 3. A feature matrix that classifies each dimension of the feature 4. A discussion of the results, partial support and unexpected behaviors All the results are empirically validated Example: Data values 1. example: location: 'Alpine', readings: time: timestamp(' t20:00:00'), ozone: 0.035, no2: , time: timestamp(' t22:00:00'), ozone: 'm', co:
19 Example: Data values 2. BNF for values: Example: Data values 3. Feature matrix: Composability (top-level values) Heterogeneity Arrays Bags Sets Maps Tuples Primitives Bag of tuples No Yes No No Partial Yes Yes Hive Jaql Any Value Yes Yes No No No Yes Yes Pig Bag of tuples Partial No Partial No Partial Yes Yes CQL Bag of tuples No Partial No Partial Partial No Yes JSONiq Any Value Yes Yes No No No Yes Yes MongoDB Bag of tuples Yes Yes No No No Yes Yes N1QL Bag of tuples Yes Yes No No No Yes Yes SQL Bag of tuples No No No No No No Yes AQL Any Value Yes Yes Yes No No Yes Yes BigQuery Bag of tuples No No No No No Yes Yes MongoJDBC Bag of tuples Yes Yes No No No Yes Yes Any Value Yes Yes Yes Partial Yes Yes Yes Example: Data values 4. Discussion of the results: Column-by-column comparison Partial support (65k scalar elements) Identify clusters (who supports JSON?) Composability (top-level values) Heterogeneity Arrays Bags Sets Maps Tuples Primitives Bag of tuples No Yes No No Partial Yes Yes Hive Jaql Any Value Yes Yes No No No Yes Yes Pig Bag of tuples Partial No Partial No Partial Yes Yes CQL Bag of tuples No Partial No Partial Partial No Yes JSONiq Any Value Yes Yes No No No Yes Yes MongoDB Bag of tuples Yes Yes No No No Yes Yes N1QL Bag of tuples Yes Yes No No No Yes Yes SQL Bag of tuples No No No No No No Yes AQL Any Value Yes Yes Yes No No Yes Yes BigQuery Bag of tuples No No No No No Yes Yes MongoJDBC Bag of tuples Yes Yes No No No Yes Yes Any Value Yes Yes Yes Partial Yes Yes Yes 19
20 Example: SELECT clause 1. example: Projecting nested collections: SELECT TUPLE s.lat, s.long, (SELECT r.ozone FROM readings AS r WHERE r.location = s.location) FROM sensors AS s "Position and (nested) ozone readings of each sensor?" Projecting non-tuples: SELECT ELEMENT ozone FROM readings "Bag of all the (scalar) ozone readings?" Example: SELECT clause 3. Feature matrix: Projecting tuples containing nested collections Projecting non-tuples Hive Partial No Jaql Yes Yes Pig Partial No CQL No No JSONiq Yes Yes MongoDB Partial Partial N1QL Partial Partial SQL No No AQL Yes Yes BigQuery No No MongoJDBC No No Yes Yes Example: SELECT clause 4. Discussion of the results: Not well supported features 3 languages support them entirely (same cluster as for data values) Projecting tuples containing nested collections Projecting non-tuples Hive Partial No Jaql Yes Yes Pig Partial No CQL No No JSONiq Yes Yes MongoDB Partial Partial N1QL Partial Partial SQL No No AQL Yes Yes BigQuery No No MongoJDBC No No Yes Yes 20
21 Config Parameters We use config parameters to encompass and compare various semantics of a feature: Minimal number of independent dimensions 1 dimension = 1 config parameter formalism parametrized by the config parameters Feature matrix classifies the values of each config parameter Example: Paths Config parameters for tuple absent : missing, type_mismatch: error, ( x.y ) Example: Paths The feature matrix classifies, for each language, the value of each config parameter: Missing Type mismatch Hive Error Error Jaql Null Error# Pig Error Error CQL Error Error JSONiq Missing# Missing# MongoDB Missing Missing N1QL Missing Missing SQL Error Error AQL Null Error BigQuery Error Error MongoJDBC 21
22 Example: Equality Function Config parameters for complex : error, type_mismatch : false, null_eq_null : null, null_eq_value : null, missing_eq_missing: missing, missing_eq_value : missing, missing_eq_null : missing ( x = y ) Example: Equality Function The feature matrix classifies, for each language, the value of each config parameter: No real cluster Some languages have multiple (incompatible) equality functions Some edge cases cannot happen due to other limitations (SQL has no complex values) Complex Type mismatch Null = Null Null = Value Missing = Missing Missing = Value Missing = Null Hive (=, <=>) Err, Err Err, Err Null, True Null, False N/A N/A N/A Jaql Boolean Null Null Null N/A N/A N/A Pig Boolean partial Err Null Null N/A N/A N/A CQL Err Err Err False/Null N/A N/A N/A JSONiq Boolean Err, False True, True False, False Missing, True Missing, False Missing, False (=, deep-equal()) Err, MongoDB Boolean False True False True False False N1QL Boolean False Null Null Missing Missing Missing SQL N/A Err Null Null N/A N/A N/A AQL Err Err Null Null N/A N/A N/A BigQuery Err Err Null Null N/A N/A N/A MongoJDBC Boolean False True False N/A @equal How to use this survey? As a database user: Understand the semantics of a (often underspecified) query language / be aware of the limitation of a database As a designer/architect of a database Produce formal specification of your query language Align semantics with SQL's As a database researcher The results might change, but the survey methodology stays As a designer/architect of database middleware Understand what capability variations need to be encapsulated and simulated 22
23 The survey shows: The marketing clusters do not correspond to real capabilities Limited capabilities: matrices are sparse and fragmented (more pressure on source-specific rewriters and distributor) The Future is Semi-Structured and Declarative Scalability Flexibility of Semistructured Data Power of Declarative: Automatic Optimization, Maintenance The primary operational and the secondary analytics application out of semistructured, declarative platforms 23
The SQL++ Unifying Semi-structured Query Language, and an Expressiveness Benchmark of SQL-on-Hadoop, NoSQL and NewSQL Databases
Noname manuscript No. (will be inserted by the editor) The SQL++ Unifying Semi-structured Query Language, and an Expressiveness Benchmark of SQL-on-Hadoop, NoSQL and NewSQL Databases Kian Win Ong Yannis
More informationarxiv: v2 [cs.db] 5 Jun 2014
Noname manuscript No. (will be inserted by the editor) The SQL++ Unifying Semi-structured Query Language, and an Expressiveness Benchmark of SQL-on-Hadoop, NoSQL and NewSQL Databases Kian Win Ong Yannis
More informationarxiv: v3 [cs.db] 15 Aug 2014
Noname manuscript No. (will be inserted by the editor) The SQL++ Unifying Semi-structured Query Language, and an Expressiveness Benchmark of SQL-on-Hadoop, NoSQL and NewSQL Databases Kian Win Ong Yannis
More informationAnnouncements. JSon Data Structures. JSon Syntax. JSon Semantics: a Tree! JSon Primitive Datatypes. Introduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 13: Json and SQL++ Announcements HW5 + WQ5 will be out tomorrow Both due in 1 week Midterm in class on Friday, 5/4 Covers everything (HW, WQ, lectures,
More informationNOSQL DATABASE SYSTEMS: DATA MODELING. Big Data Technologies: NoSQL DBMS (Data Modeling) - SoSe
NOSQL DATABASE SYSTEMS: DATA MODELING Big Data Technologies: NoSQL DBMS (Data Modeling) - SoSe 2017 24 Data Modeling Object-relational impedance mismatch Example: orders, order lines, customers (with different
More informationJSON - Overview JSon Terminology
Announcements Introduction to Database Systems CSE 414 Lecture 12: Json and SQL++ Office hours changes this week Check schedule HW 4 due next Tuesday Start early WQ 4 due tomorrow 1 2 JSON - Overview JSon
More informationCompile-Time Code Generation for Embedded Data-Intensive Query Languages
Compile-Time Code Generation for Embedded Data-Intensive Query Languages Leonidas Fegaras University of Texas at Arlington http://lambda.uta.edu/ Outline Emerging DISC (Data-Intensive Scalable Computing)
More informationCSE 344 APRIL 16 TH SEMI-STRUCTURED DATA
CSE 344 APRIL 16 TH SEMI-STRUCTURED DATA ADMINISTRATIVE MINUTIAE HW3 due Wednesday OQ4 due Wednesday HW4 out Wednesday (Datalog) Exam May 9th 9:30-10:20 WHERE WE ARE So far we have studied the relational
More informationThe Pig Experience. A. Gates et al., VLDB 2009
The Pig Experience A. Gates et al., VLDB 2009 Why not Map-Reduce? Does not directly support complex N-Step dataflows All operations have to be expressed using MR primitives Lacks explicit support for processing
More informationOral Questions and Answers (DBMS LAB) Questions & Answers- DBMS
Questions & Answers- DBMS https://career.guru99.com/top-50-database-interview-questions/ 1) Define Database. A prearranged collection of figures known as data is called database. 2) What is DBMS? Database
More information10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 11: NoSQL & JSON (mostly not in textbook only Ch 11.1) HW5 will be posted on Friday and due on Nov. 14, 11pm [No Web Quiz 5] Today s lecture: NoSQL & JSON
More informationBeyond Hive Pig and Python
Beyond Hive Pig and Python What is Pig? Pig performs a series of transformations to data relations based on Pig Latin statements Relations are loaded using schema on read semantics to project table structure
More informationCSE 344 MAY 7 TH EXAM REVIEW
CSE 344 MAY 7 TH EXAM REVIEW EXAMINATION STATIONS Exam Wednesday 9:30-10:20 One sheet of notes, front and back Practice solutions out after class Good luck! EXAM LENGTH Production v. Verification Practice
More informationData Formats and APIs
Data Formats and APIs Mike Carey mjcarey@ics.uci.edu 0 Announcements Keep watching the course wiki page (especially its attachments): https://grape.ics.uci.edu/wiki/asterix/wiki/stats170ab-2018 Ditto for
More informationHadoop is supplemented by an ecosystem of open source projects IBM Corporation. How to Analyze Large Data Sets in Hadoop
Hadoop Open Source Projects Hadoop is supplemented by an ecosystem of open source projects Oozie 25 How to Analyze Large Data Sets in Hadoop Although the Hadoop framework is implemented in Java, MapReduce
More information5/1/17. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 15: NoSQL & JSON (mostly not in textbook only Ch 11.1) 1 Homework 4 due tomorrow night [No Web Quiz 5] Midterm grading hopefully finished tonight post online
More informationUniform Query Framework for Relational and NoSQL Databases
Copyright 2017 Tech Science Press CMES, vol.113, no.2, pp.171-181, 2017 Uniform Query Framework for Relational and NoSQL Databases J.B. Karanjekar 1 and M.B. Chandak 2 Abstract: As the data managed by
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data
More informationSources. P. J. Sadalage, M Fowler, NoSQL Distilled, Addison Wesley
Big Data and NoSQL Sources P. J. Sadalage, M Fowler, NoSQL Distilled, Addison Wesley Very short history of DBMSs The seventies: IMS end of the sixties, built for the Apollo program (today: Version 15)
More informationBIG DATA COURSE CONTENT
BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationJaql. Kevin Beyer, Vuk Ercegovac, Eugene Shekita, Jun Rao, Ning Li, Sandeep Tata. IBM Almaden Research Center
Jaql Running Pipes in the Clouds Kevin Beyer, Vuk Ercegovac, Eugene Shekita, Jun Rao, Ning Li, Sandeep Tata IBM Almaden Research Center http://code.google.com/p/jaql/ 2009 IBM Corporation Motivating Scenarios
More informationUnifying Big Data Workloads in Apache Spark
Unifying Big Data Workloads in Apache Spark Hossein Falaki @mhfalaki Outline What s Apache Spark Why Unification Evolution of Unification Apache Spark + Databricks Q & A What s Apache Spark What is Apache
More informationCourse Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:
Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course: 20762C Developing SQL 2016 Databases Module 1: An Introduction to Database Development Introduction to the
More informationQuerying Data with Transact SQL
Course 20761A: Querying Data with Transact SQL Course details Course Outline Module 1: Introduction to Microsoft SQL Server 2016 This module introduces SQL Server, the versions of SQL Server, including
More informationHadoop Development Introduction
Hadoop Development Introduction What is Bigdata? Evolution of Bigdata Types of Data and their Significance Need for Bigdata Analytics Why Bigdata with Hadoop? History of Hadoop Why Hadoop is in demand
More informationWriting Analytical Queries for Business Intelligence
MOC-55232 Writing Analytical Queries for Business Intelligence 3 Days Overview About this Microsoft SQL Server 2016 Training Course This three-day instructor led Microsoft SQL Server 2016 Training Course
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationModule 4. Implementation of XQuery. Part 0: Background on relational query processing
Module 4 Implementation of XQuery Part 0: Background on relational query processing The Data Management Universe Lecture Part I Lecture Part 2 2 What does a Database System do? Input: SQL statement Output:
More informationQUERYING MICROSOFT SQL SERVER COURSE OUTLINE. Course: 20461C; Duration: 5 Days; Instructor-led
CENTER OF KNOWLEDGE, PATH TO SUCCESS Website: QUERYING MICROSOFT SQL SERVER Course: 20461C; Duration: 5 Days; Instructor-led WHAT YOU WILL LEARN This 5-day instructor led course provides students with
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationToday s topics. Null Values. Nulls and Views in SQL. Standard Boolean 2-valued logic 9/5/17. 2-valued logic does not work for nulls
Today s topics CompSci 516 Data Intensive Computing Systems Lecture 4 Relational Algebra and Relational Calculus Instructor: Sudeepa Roy Finish NULLs and Views in SQL from Lecture 3 Relational Algebra
More informationIntroduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data
Introduction to Hadoop High Availability Scaling Advantages and Challenges Introduction to Big Data What is Big data Big Data opportunities Big Data Challenges Characteristics of Big data Introduction
More informationResearch challenges in data-intensive computing The Stratosphere Project Apache Flink
Research challenges in data-intensive computing The Stratosphere Project Apache Flink Seif Haridi KTH/SICS haridi@kth.se e2e-clouds.org Presented by: Seif Haridi May 2014 Research Areas Data-intensive
More information5/2/16. Announcements. NoSQL Motivation. The New Hipster: NoSQL. Serverless. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 16: NoSQL and JSon Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5 Today s lecture: JSon The book covers
More informationDatabase Systems CSE 414
Database Systems CSE 414 Lecture 16: NoSQL and JSon CSE 414 - Spring 2016 1 Announcements Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5] Today s lecture:
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationPig A language for data processing in Hadoop
Pig A language for data processing in Hadoop Antonino Virgillito THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Apache Pig: Introduction Tool for querying data on Hadoop
More informationBig Data Analytics using Apache Hadoop and Spark with Scala
Big Data Analytics using Apache Hadoop and Spark with Scala Training Highlights : 80% of the training is with Practical Demo (On Custom Cloudera and Ubuntu Machines) 20% Theory Portion will be important
More informationDATABASE DESIGN II - 1DL400
DATABASE DESIGN II - 1DL400 Fall 2016 A second course in database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationApril Copyright 2013 Cloudera Inc. All rights reserved.
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on
More informationDocument stores using CouchDB
2018 Document stores using CouchDB ADVANCED DATABASE PROJECT APARNA KHIRE, MINGRUI DONG aparna.khire@vub.be, mingdong@ulb.ac.be 1 Table of Contents 1. Introduction... 3 2. Background... 3 2.1 NoSQL Database...
More informationRelational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4
More informationWe are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info
We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423
More informationHadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here 2013-11-12 Copyright 2013 Cloudera
More informationCOSC 416 NoSQL Databases. NoSQL Databases Overview. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 416 NoSQL Databases NoSQL Databases Overview Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Databases Brought Back to Life!!! Image copyright: www.dragoart.com Image
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationAsterixDB. A Scalable Open Source DBMS. This presentation is based on slides made by Michael J. Carey, Chen Li, and Vassilis Tsotras 11/28/2018 1
AsterixDB A Scalable Open Source DBMS This presentation is based on slides made by Michael J. Carey, Chen Li, and Vassilis Tsotras 1 Big Data / Web Warehousing So what went on and why? What s going on
More informationThe power of PostgreSQL exposed with automatically generated API endpoints. Sylvain Verly Coderbunker 2016Postgres 中国用户大会 Postgres Conference China 20
The power of PostgreSQL exposed with automatically generated API endpoints. Sylvain Verly Coderbunker Development actors Frontend developer Backend developer Database administrator System administrator
More informationRelational Algebra. Study Chapter Comp 521 Files and Databases Fall
Relational Algebra Study Chapter 4.1-4.2 Comp 521 Files and Databases Fall 2010 1 Relational Query Languages Query languages: Allow manipulation and retrieval of data from a database. Relational model
More informationMapReduce and Friends
MapReduce and Friends Craig C. Douglas University of Wyoming with thanks to Mookwon Seo Why was it invented? MapReduce is a mergesort for large distributed memory computers. It was the basis for a web
More informationSimplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration
Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration Howard Ho IBM Almaden Research Center Take Home Messages Clio (Schema Mapping for EII) has evolved to Clio 2010
More information745: Advanced Database Systems
745: Advanced Database Systems Yanlei Diao University of Massachusetts Amherst Outline Overview of course topics Course requirements Database Management Systems 1. Online Analytical Processing (OLAP) vs.
More informationQuerying Microsoft SQL Server
Querying Microsoft SQL Server 20461D; 5 days, Instructor-led Course Description This 5-day instructor led course provides students with the technical skills required to write basic Transact SQL queries
More informationIncrease Value from Big Data with Real-Time Data Integration and Streaming Analytics
Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time
More informationNoSQL Databases MongoDB vs Cassandra. Kenny Huynh, Andre Chik, Kevin Vu
NoSQL Databases MongoDB vs Cassandra Kenny Huynh, Andre Chik, Kevin Vu Introduction - Relational database model - Concept developed in 1970 - Inefficient - NoSQL - Concept introduced in 1980 - Related
More informationCertified Big Data Hadoop and Spark Scala Course Curriculum
Certified Big Data Hadoop and Spark Scala Course Curriculum The Certified Big Data Hadoop and Spark Scala course by DataFlair is a perfect blend of indepth theoretical knowledge and strong practical skills
More informationMTA Database Administrator Fundamentals Course
MTA Database Administrator Fundamentals Course Session 1 Section A: Database Tables Tables Representing Data with Tables SQL Server Management Studio Section B: Database Relationships Flat File Databases
More informationCOSC 304 Introduction to Database Systems. NoSQL Databases. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 304 Introduction to Database Systems NoSQL Databases Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Relational Databases Relational databases are the dominant form
More informationIntroduction to NoSQL by William McKnight
Introduction to NoSQL by William McKnight All rights reserved. Reproduction in whole or part prohibited except by written permission. Product and company names mentioned herein may be trademarks of their
More informationT-SQL Training: T-SQL for SQL Server for Developers
Duration: 3 days T-SQL Training Overview T-SQL for SQL Server for Developers training teaches developers all the Transact-SQL skills they need to develop queries and views, and manipulate data in a SQL
More informationHacking PostgreSQL Internals to Solve Data Access Problems
Hacking PostgreSQL Internals to Solve Data Access Problems Sadayuki Furuhashi Treasure Data, Inc. Founder & Software Architect A little about me... > Sadayuki Furuhashi > github/twitter: @frsyuki > Treasure
More informationBig Data Management and NoSQL Databases
NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic
More informationRethinkDB. Niharika Vithala, Deepan Sekar, Aidan Pace, and Chang Xu
RethinkDB Niharika Vithala, Deepan Sekar, Aidan Pace, and Chang Xu Content Introduction System Features Data Model ReQL Applications Introduction Niharika Vithala What is a NoSQL Database Databases that
More informationOracle GoldenGate for Big Data
Oracle GoldenGate for Big Data The Oracle GoldenGate for Big Data 12c product streams transactional data into big data systems in real time, without impacting the performance of source systems. It streamlines
More informationColumn-Family Stores: Cassandra
NDBI040: Big Data Management and NoSQL Databases h p://www.ksi.mff.cuni.cz/ svoboda/courses/2016-1-ndbi040/ Lecture 10 Column-Family Stores: Cassandra Mar n Svoboda svoboda@ksi.mff.cuni.cz 13. 12. 2016
More informationBig Data Architect.
Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional
More informationPROFESSIONAL. NoSQL. Shashank Tiwari WILEY. John Wiley & Sons, Inc.
PROFESSIONAL NoSQL Shashank Tiwari WILEY John Wiley & Sons, Inc. Examining CONTENTS INTRODUCTION xvil CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3 Definition and Introduction 4 Context and a Bit
More informationNew Developments in Spark
New Developments in Spark And Rethinking APIs for Big Data Matei Zaharia and many others What is Spark? Unified computing engine for big data apps > Batch, streaming and interactive Collection of high-level
More informationQuery Languages for Document Stores
Query Languages for Document Stores NoSQL matters conference 2013-04-26 Jan Steemann me I'm a software developer working at triagens GmbH on and with Documents Documents documents are self-contained, aggregate
More information1 Big Data Hadoop. 1. Introduction About this Course About Big Data Course Logistics Introductions
Big Data Hadoop Architect Online Training (Big Data Hadoop + Apache Spark & Scala+ MongoDB Developer And Administrator + Apache Cassandra + Impala Training + Apache Kafka + Apache Storm) 1 Big Data Hadoop
More informationSystems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13
Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2012/13 Data Stream Processing Topics Model Issues System Issues Distributed Processing Web-Scale Streaming 3 Data Streams Continuous
More informationRajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10
Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10 RAJIV GANDHI COLLEGE OF ENGINEERING & TECHNOLOGY, KIRUMAMPAKKAM-607 402 DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK
More informationQuerying Data with Transact-SQL
Course 20761A: Querying Data with Transact-SQL Page 1 of 5 Querying Data with Transact-SQL Course 20761A: 2 days; Instructor-Led Introduction The main purpose of this 2 day instructor led course is to
More informationAdvanced Data Management Technologies Written Exam
Advanced Data Management Technologies Written Exam 02.02.2016 First name Student number Last name Signature Instructions for Students Write your name, student number, and signature on the exam sheet. This
More information4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML)
Since in the result relation each group is represented by exactly one tuple, in the select clause only aggregate functions can appear, or attributes that are used for grouping, i.e., that are also used
More informationQuerying Data with Transact-SQL
Querying Data with Transact-SQL Course: 20761 Course Details Audience(s): IT Professional(s) Technology: Microsoft SQL Server 2016 Duration: 24 HRs. ABOUT THIS COURSE This course is designed to introduce
More informationCSE 344 JULY 9 TH NOSQL
CSE 344 JULY 9 TH NOSQL ADMINISTRATIVE MINUTIAE HW3 due Wednesday tests released actual_time should have 0s not NULLs upload new data file or use UPDATE to change 0 ~> NULL Extra OOs on Mondays 5-7pm in
More informationIntroduction to Big Data. NoSQL Databases. Instituto Politécnico de Tomar. Ricardo Campos
Instituto Politécnico de Tomar Introduction to Big Data NoSQL Databases Ricardo Campos Mestrado EI-IC Análise e Processamento de Grandes Volumes de Dados Tomar, Portugal, 2016 Part of the slides used in
More informationQuerying Microsoft SQL Server 2012/2014
Page 1 of 14 Overview This 5-day instructor led course provides students with the technical skills required to write basic Transact-SQL queries for Microsoft SQL Server 2014. This course is the foundation
More informationInnovatus Technologies
HADOOP 2.X BIGDATA ANALYTICS 1. Java Overview of Java Classes and Objects Garbage Collection and Modifiers Inheritance, Aggregation, Polymorphism Command line argument Abstract class and Interfaces String
More informationCSE 344 FEBRUARY 2 ND DATA SEMI-STRUCTURED
CSE 344 FEBRUARY 2 ND DATA SEMI-STRUCTURED ADMINISTRATIVE MINUTIAE HW3 due Tonight, 11:30 pm OQ4 Due Wednesday, 11:00 pm HW4 due Friday 11:30 pm Exam next Friday 3:30-5:00 WHERE WE ARE So far we have studied
More informationAVANTUS TRAINING PTE LTD
[MS20461]: Querying Microsoft SQL Server 2014 Length : 5 Days Audience(s) : IT Professionals Level : 300 Technology : SQL Server Delivery Method : Instructor-led (Classroom) Course Overview This 5-day
More informationCertified Big Data and Hadoop Course Curriculum
Certified Big Data and Hadoop Course Curriculum The Certified Big Data and Hadoop course by DataFlair is a perfect blend of in-depth theoretical knowledge and strong practical skills via implementation
More informationNOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe
NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS h_da Prof. Dr. Uta Störl Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe 2017 163 Performance / Benchmarks Traditional database benchmarks
More informationUsing ElasticSearch to Enable Stronger Query Support in Cassandra
Using ElasticSearch to Enable Stronger Query Support in Cassandra www.impetus.com Introduction Relational Databases have been in use for decades, but with the advent of big data, there is a need to use
More informationOverview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::
Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized
More informationSQL in the Hybrid World
SQL in the Hybrid World Tanel Poder a long time computer performance geek 1 Tanel Põder Intro: About me Oracle Database Performance geek (18+ years) Exadata Performance geek Linux Performance geek Hadoop
More informationQuerying Microsoft SQL Server
Querying Microsoft SQL Server Course 20461D 5 Days Instructor-led, Hands-on Course Description This 5-day instructor led course is designed for customers who are interested in learning SQL Server 2012,
More informationQuerying Data with Transact SQL Microsoft Official Curriculum (MOC 20761)
Querying Data with Transact SQL Microsoft Official Curriculum (MOC 20761) Course Length: 3 days Course Delivery: Traditional Classroom Online Live MOC on Demand Course Overview The main purpose of this
More informationIntroduction to NoSQL Databases
Introduction to NoSQL Databases Roman Kern KTI, TU Graz 2017-10-16 Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 1 / 31 Introduction Intro Why NoSQL? Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 2 / 31 Introduction
More informationTeradata SQL Features Overview Version
Table of Contents Teradata SQL Features Overview Version 14.10.0 Module 0 - Introduction Course Objectives... 0-4 Course Description... 0-6 Course Content... 0-8 Module 1 - Teradata Studio Features Optimize
More informationShark: Hive (SQL) on Spark
Shark: Hive (SQL) on Spark Reynold Xin UC Berkeley AMP Camp Aug 21, 2012 UC BERKELEY SELECT page_name, SUM(page_views) views FROM wikistats GROUP BY page_name ORDER BY views DESC LIMIT 10; Stage 0: Map-Shuffle-Reduce
More informationBig Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours
Big Data Hadoop Developer Course Content Who is the target audience? Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Complete beginners who want to learn Big Data Hadoop Professionals
More informationIntroduction to BigData, Hadoop:-
Introduction to BigData, Hadoop:- Big Data Introduction: Hadoop Introduction What is Hadoop? Why Hadoop? Hadoop History. Different types of Components in Hadoop? HDFS, MapReduce, PIG, Hive, SQOOP, HBASE,
More informationProgress DataDirect For Business Intelligence And Analytics Vendors
Progress DataDirect For Business Intelligence And Analytics Vendors DATA SHEET FEATURES: Direction connection to a variety of SaaS and on-premises data sources via Progress DataDirect Hybrid Data Pipeline
More informationMiddleware Layer for Replication between Relational and Document Databases
Middleware Layer for Replication between Relational and Document Databases Susantha Bathige and P.P.G Dinesh Asanka Abstract Effective and efficient data management is key to today s competitive business
More informationYahoo! manages more than 42,000 Hadoop nodes holding 200 PBs. more than 22 organizations are running PB-scale Hadoop clusters
An Optimization Framework for Map-Reduce Queries Leonidas Fegaras Chengkai Li Upa Gupta University of Texas at Arlington http://lambda.uta.edu/mrql/ Alternatives to MR Compared to a parallel RDB, the MR
More informationBigInsights and Cognos Stefan Hubertus, Principal Solution Specialist Cognos Wilfried Hoge, IT Architect Big Data IBM Corporation
BigInsights and Cognos Stefan Hubertus, Principal Solution Specialist Cognos Wilfried Hoge, IT Architect Big Data 2013 IBM Corporation A Big Data architecture evolves from a traditional BI architecture
More informationPhysical Database Design and Tuning. Chapter 20
Physical Database Design and Tuning Chapter 20 Introduction We will be talking at length about database design Conceptual Schema: info to capture, tables, columns, views, etc. Physical Schema: indexes,
More information