CSE 232A Graduate Database Systems

1 CSE 232A Graduate Database Systems Fall 2018 Arun Kumar 1

2 What is this course about? Why take it? 2

3 1. IBM's Watson beats humans in Jeopardy!

4 How did Watson achieve that? 4

5 Watson devoured LOTS of data! 5

6 2. Structured data with Google search results 6

7 How does Google know that? 7

8 Google also devours LOTS of data! 8

9 3. Amazon's spot-on recommendations

10 How does Amazon know that? 10

11 You guessed it! LOTS and LOTS of data! Analysis 11

12 And innumerable traditional applications 12

13 13

14 Large-scale data management systems are the cornerstone of many digital applications, both modern and traditional 14

15 The Age of Big Data / Data Science 15

16 Data data everywhere, All the wallets did shrink! Data data everywhere, Nor any moment to think? 16

17 CSE 232A will get you thinking about the fundamentals of database systems:
1. How are large structured datasets stored and organized?
2. How are queries handled?
3. How to make the system faster?
4. Deeper and more recent issues

18 And now for the (boring) logistics 18

19 Course Administrivia
Lectures: MWF 1:00-1:50pm, CNTR 214. Attending ALL lectures is mandatory!
Instructor: Arun Kumar; Office hours: Wed 2-3pm, 3218 EBU3b (CSE building)
TAs: Chand Anand and Gurkanwal Singh Batra
Piazza:

20 Prerequisites
CSE 132A (or equivalent) is essential.
CSE 120 is also helpful.
For all other cases, email the instructor with proper justification. A waiver can be considered.

21 Grading
Midterm Exam 1: 20%. Date: Friday, October 26, in-class
Midterm Exam 2: 20%. Date: Wednesday, November 21, in-class
Final Exam: 60% (cumulative). Date: TBD, Room TBD
(Optional/Extra Credit) 8 Paper Reviews: 8%

22 Course Outline
1. Storage and file layout: How is large structured data stored?
2. Indexing, sorting, relational operator implementation, and query processing: How are queries handled?
3. Query optimization and parallelism: How to make the system faster?
4. Transactions; data integration: How to support data integrity?
5. ML for DB and DB for ML: Recent issues and trends

23 The primary focus will be the relational data model and Relational Database Management Systems (RDBMS) 23

24 Relational model in a nutshell Basically, Relation:Table :: Pilot:Driver (okay, a bit more) RatingID Rating Date UserID MovieID /27/ /20/ /02/ The model formalizes operations to manipulate relations 24

25 Relational model in a nutshell
Invented by E. F. Codd in the 1970s at IBM Research in CA

26 Relational DBMS in a nutshell A software system to implement the relational model, i.e., enable users to manage data stored as relations 26

27 Relational DBMS in a nutshell First RDBMSs: System R (IBM) and Ingres (Berkeley) in 1970s A rare photo of the original System R manual Mike Stonebraker won the Turing Award in 2015! 27

28 Relational DBMS in a nutshell
RDBMS software is now a USD 40+ billion/year industry!
Numerous open source RDBMSs are also popular.
People still start companies about what are basically RDBMSs!

29 Course Textbook
Prescribed: Database Management Systems, 3rd Edition, by Raghu Ramakrishnan and Johannes Gehrke (aka "The Cow Book"). Which cow are you?
Optional: Database Systems: The Complete Book, by Garcia-Molina, Widom, and Ullman; Big Data Integration, by Dong and Srivastava

30 Course Schedule
Week  Topic
1     Introduction; Data Storage (Disks; Files; New Hardware)
2-3   Indexing (B+ Tree; Hash Indexes; Learned Indexes); Sorting
3-4   Relational Operator Implementations; Query Processing
4     Midterm Exam 1 on Friday, 10/26
5     Query Optimization; Materialized Views
6     Parallel RDBMSs and Dataflow Systems
7-8   Transaction Management; Concurrency Control
8     Midterm Exam 2 on Wednesday, 11/21
9     Semi-structured Data; Data Integration and Cleaning
10    Advanced Topics: ML for DB and DB for ML
11    Final Exam on TBD

31 General Dos and Do NOTs
Do:
Raise your hand before asking questions during lectures
Discuss papers in the reading list with peers and others
Participate in class discussions; and on Piazza, if you like
Use "CSE232A:" as the subject prefix for all emails to me/TAs
Do NOT:
Plagiarize or share content for your paper reviews
Harass, cut off, or be disrespectful to others in class
Use email as the primary communication mechanism for doubts/questions instead of Office Hours
Record or quote the instructor's anecdotes out of class!

32 Questions? Recap of Relational Model and SQL not covered in class; slides made available 32

33 Example Relational DB: Netflix! Ratings RatingID NumStars Timestamp UserID MovieID /27/ UserID Name Age JoinDate 79 Alice 23 01/10/13 Users MovieID Name ReleaseDate Director Movies 20 Inception 07/13/2010 Christopher Nolan 33

34 Surprise Review Questions! Q1. Which of the following is not a basic operation in relational algebra? A B C D [./ 34

35 Surprise Review Questions! R U M RID NumStars RateDate UID MID UID UName Age JoinDate MID MName Year Director Q2. What type of join is the following NOT an example of? R./ U A B C D Inner join Outer join Natural join Theta join 35

36 Surprise Review Questions!
R: RID, NumStars, RateDate, UID, MID
U: UID, UName, Age, JoinDate
M: MID, MName, Year, Director
Q3. How many movies did Jim Cameron direct?
A. $\sigma_{Director='Jim Cameron'}(\gamma_{COUNT(*)}(M))$
B. $\gamma_{ReleaseYear, COUNT(*)}(\sigma_{Director='Jim Cameron'}(M))$
C. $\gamma_{COUNT(*)}(\sigma_{Director='Jim Cameron'}(M \bowtie R))$
D. $\gamma_{COUNT(*)}(\sigma_{Director='Jim Cameron'}(M))$

37 Surprise Review Questions! R U M RID NumStars RateDate UID MID UID UName Age JoinDate MID MName Year Director Q4. Which of the following attributes is a primary key? A B C D U.UID R.UID R.MID M.Year 37

38 Surprise Review Questions! Q5. Which of the following SQL features do not have a counterpart operation in extended relational algebra? A B C D SELECT DISTINCT WHERE GROUP BY LIMIT 38

39 Surprise Review Questions! R RID NumStars RateDate UID MID /27/ /20/ /02/ /05/ /09/ Q6. What is the cardinality of this query s output? SELECT COUNT(DISTINCT MID) FROM R A 1 C 3 B 2 D 4 39

40 Surprise Review Questions!
M: MID, MName, Year, Director
Q7. Get the year and director of all movies from the last decade that have the term "Avengers" in their title.
A. SELECT Year, Director FROM M WHERE Year >= 2008 AND MName LIKE '%Avengers%'
B. SELECT Year, Director FROM M WHERE Year >= 2008 AND MName LIKE '%Avengers'
C. SELECT Year, Director FROM M WHERE Year >= 2008 AND MName LIKE 'Avengers%'
D. SELECT Year, Director FROM M WHERE Year >= 2008 AND MName = 'Avengers'

41 Surprise Review Questions!
Ratings (R): RatingID, Stars, UserID, MovieID
Movies (M): MovieID, Name, Year, Director
Q8. Write an SQL query to get the number of 5-star ratings for each movie directed by Christopher Nolan
SELECT R.MovieID, COUNT(R.Stars) AS NumHighRatings
FROM Ratings R, Movies M
WHERE M.Director = 'Christopher Nolan' AND R.MovieID = M.MovieID AND R.Stars = 5
GROUP BY R.MovieID
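The Q8 query can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the table contents below are hypothetical (the slide gives no rows), and the string literal is quoted as SQL requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Name TEXT, Year INTEGER, Director TEXT)")
conn.execute("CREATE TABLE Ratings (RatingID INTEGER, Stars INTEGER, UserID INTEGER, MovieID INTEGER)")

# Hypothetical rows, chosen only to exercise the query.
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?)", [
    (20, "Inception", 2010, "Christopher Nolan"),
    (91, "Interstellar", 2014, "Christopher Nolan"),
    (16, "Avatar", 2009, "Jim Cameron"),
])
conn.executemany("INSERT INTO Ratings VALUES (?, ?, ?, ?)", [
    (1, 5, 79, 20), (2, 5, 80, 20), (3, 4, 79, 20),
    (4, 5, 79, 91), (5, 5, 80, 16),
])

q8 = """
SELECT R.MovieID, COUNT(R.Stars) AS NumHighRatings
FROM Ratings R, Movies M
WHERE M.Director = 'Christopher Nolan'
  AND R.MovieID = M.MovieID
  AND R.Stars = 5
GROUP BY R.MovieID
"""
# Map each MovieID to its number of 5-star ratings.
high_ratings = dict(conn.execute(q8).fetchall())
print(high_ratings)
```

Note that a Nolan movie with no 5-star ratings would not appear in the output at all, because the join and the filter run before the grouping.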

42 Recap: Relational Model (Not covered in class) 42

43 Relational Model: Basic Terms
Ratings: RatingID, NumStars, Timestamp, UserID, MovieID
What is a Relation? A glorified table!
What are Attributes? These things (the column names)
What are Domains? The mathematical domains for the attributes (e.g., Integers, Real)
What is Arity? Number of attributes

44 Relational Model: Basic Terms
Ratings: RatingID, NumStars, Timestamp, UserID, MovieID
What are Tuples? These things (the rows)
What is Cardinality? Number of tuples

45 Referring to tuples: Two notations
1. Without using attribute names (positional/sequence)
2. Using attribute names (named/set)
Ratings (R) RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ t[1] = 3.5 t.numstars =
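The two notations can be mimicked in Python as a sketch. The tuple values below are hypothetical apart from NumStars = 3.5, which the slide shows; positions are counted from 0, so `t[1]` is NumStars.

```python
from collections import namedtuple

# Schema of Ratings as given on the slide.
Rating = namedtuple("Rating", ["RatingID", "NumStars", "Timestamp", "UserID", "MovieID"])

# Hypothetical tuple; only NumStars = 3.5 comes from the slide.
t = Rating(RatingID=1, NumStars=3.5, Timestamp="08/27/15", UserID=79, MovieID=20)

positional = t[1]       # 1. positional/sequence notation
named = t.NumStars      # 2. named/set notation
print(positional, named)
```

Both notations refer to the same value; the named form survives a reordering of the attributes, while the positional form does not.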

46 Relational Model: Basic Terms What is Schema? Ratings RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ The relation name, and the name and logical descriptions of the attributes (including domains) Aka metadata 46

47 Relational Model: Basic Terms What is an Instance? Ratings RatingID NumStars Timestamp UserID MovieID RatingID 1 NumStars /27/15 Timestamp 79 UserID MovieID /20/15 06/27/ /02/15 07/10/ /08/ Instance 1 Instance 2 A given relation populated with a set of tuples (loose analogy: schema:instance::type:value in PL) 47

48 Relational Model: Basic Terms Ratings What is a Relational Database? RatingID NumStars Timestamp UserID MovieID /27/ UserID Name Age JoinDate 79 Alice 23 01/10/13 Users MovieID Name ReleaseDate Director Movies 20 Inception 07/13/2010 Christopher Nolan A collection of relations; similarly, schema vs. instance 48

49 Write Operations on a Relation
Insert: Add tuples to a relation
Delete: Remove tuples from a relation (typically based on predicate matches, e.g., NumStars <= 4.5)
Modify: Logically, deletes + inserts, but typically implemented as in-place updates to a relation instance

50 Read Operations on a Relation Select Select all tuples from Ratings with UserID == 19 Project Select only Director attribute from Movies Aggregate Select Average of all NumStars in Ratings And a few more formal algebraic operations 50

51 Recap: Relational Algebra 51

52 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 52

53 Select Ratings (R) RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ Example: Get all ratings with 4.0 or more stars Select Operator Selection condition/predicate 53

54 Select R RatingID NumStars Timestamp UserID MovieID R /27/ /20/ /02/ /05/ Schema preserved Subset of tuples (satisfying selection condition) RatingID NumStars Timestamp UserID MovieID /20/ /05/

55 Complex Selection Conditions R RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ R RatingID NumStars Timestamp UserID MovieID /27/

56 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 56

57 Project Ratings (R) RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ Example: Get all MovieIDs Projection list 57

58 Project R RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ R Schema reduced (to projection list) MovieID Tuple values deduplicated 20 (slightly different semantics in SQL) 16 58

59 Composition of Relational Ops R RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ Example: Get UserID and NumStars of ratings with less than 3 stars R UserID NumStars RelOp(Relation) gives a Relation 59

60 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 60

61 Rename
Ratings (R): RatingID, NumStars, Timestamp, UserID, MovieID
Example: Rename Timestamp to RateDate
$\rho_{RatingID, NumStars, RateDate, UserID, MovieID}(R)$
$\rho_{C(2 \to RateDate)}(R)$

62 Rename R R RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ C(2 >RateDate) (R) Instance preserved Schema modified RatingID NumStars RateDate UserID MovieID /27/ /20/ /02/ /05/

63 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 63

64 Cross Product UserID Name Age JoinDate 79 Alice 23 01/10/13 80 Bob 41 05/10/13 Users (U) Movies (M) MovieID Name ReleaseYear Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron Cartesian product (construct all pairs of tuples across tables) Schema of output concatenates the input schemas Be careful with attribute name conflicts! Use Rename op 64

65 Cross Product UserID Name Age JoinDate 79 Alice 23 01/10/13 80 Bob 41 05/10/13 Users (U) MovieID Name ReleaseYear Director Movies (M) 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron C(1 >U.Name (U) C(1 >M.Name (M) UserID U.Name Age JoinDate MovieID M.Name Release Year Director 79 Alice 23 01/10/13 20 Inception 2010 Christopher Nolan 79 Alice 23 01/10/13 16 Avatar 2009 Jim Cameron 80 Bob 41 05/10/13 20 Inception 2010 Christopher Nolan 80 Bob 41 05/10/13 16 Avatar 2009 Jim Cameron 65
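A minimal sketch of the cross product above, using the Users and Movies tuples from the slide. The two Name attributes are renamed to U.Name and M.Name first so the concatenated output schema has no clashes.

```python
from itertools import product

# Instances from the slide.
users = [(79, "Alice", 23, "01/10/13"), (80, "Bob", 41, "05/10/13")]
movies = [(20, "Inception", 2010, "Christopher Nolan"),
          (16, "Avatar", 2009, "Jim Cameron")]

# Schemas after the Rename step (Name -> U.Name / M.Name).
users_schema = ["UserID", "U.Name", "Age", "JoinDate"]
movies_schema = ["MovieID", "M.Name", "ReleaseYear", "Director"]

# Output schema concatenates the input schemas.
out_schema = users_schema + movies_schema
# Output instance pairs every Users tuple with every Movies tuple.
out = [u + m for u, m in product(users, movies)]

for row in out:
    print(row)
```

The output always has |U| * |M| tuples, which is why joins try to avoid materializing the full cross product.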

66 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 66

67 Union R1 RatingID NumStars Timestamp UserID MovieID R2 Union of sets of tuples (instances) /27/ /20/ /02/ Inputs must have identical schema: Union-compatible RatingID NumStars Timestamp UserID MovieID /02/ /05/ /09/

68 Union RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ /09/

69 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 69

70 Set Difference R1 RatingID NumStars Timestamp UserID MovieID R2 Set difference of sets of tuples (instances) /27/ /20/ /02/ Inputs must be Union-compatible RatingID NumStars Timestamp UserID MovieID /02/ /05/ /09/

71 Set Difference R1 R2 RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ RatingID NumStars Timestamp UserID MovieID /02/ /05/ /09/ R = R1 R2 RatingID NumStars Timestamp UserID MovieID /27/ /20/

72 Basic Relational Operations Select Project Rename Cross Product (aka Cartesian Product) Set Operations: Union Set Difference 72

73 Derived and Other Relational Ops Set Operation: Intersection Join Group By Aggregate 73

74 Intersection R1 RatingID NumStars Timestamp UserID MovieID R2 Set intersection of sets of tuples (instances) /27/ /20/ /02/ Inputs must be Union-compatible RatingID NumStars Timestamp UserID MovieID /02/ /05/ /09/

75 Intersection R1 R2 RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ RatingID NumStars Timestamp UserID MovieID /02/ /05/ /09/ RatingID NumStars Timestamp UserID MovieID /02/

76 Derived and Other Relational Ops Set Operation: Intersection Join Group By Aggregate 76

77 Join
$R \bowtie_{JoinCondition} M$
Equivalent to a Select on the Cross Product, $\sigma_{JoinCondition}(R \times M)$, but bypasses the full cross product
Perhaps the most intensively studied Rel Op!
Several types of Joins:
Natural Join and Equi-Join
Condition Join (aka Theta Join)
Semi-Join, Inner Join, Outer Join, Anti-Join, etc.

78 Natural Join Ratings (R) RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/ Movies (M) MovieID Name ReleaseYear Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron R./ M 78

79 Natural Join
$R \bowtie M$
Output schema: RatingID, NumStars, Timestamp, UserID, MovieID, Name, ReleaseYear, Director
The output has four tuples: two ratings matched with (Inception, 2010, Christopher Nolan) and two matched with (Avatar, 2009, Jim Cameron)
Join attributes: attributes that determine matching tuples
Have the same name in both inputs! Here, MovieID in R and M
Implicit equality condition on join attributes
If > 1 pair, implicit logical "and" of all equality terms
Output schema concatenates input schemas
But join attributes appear only once in output (Project)

80 Equi-Join
Generalization of the Natural Join
$R \bowtie_{EqualityCondition} M$
Attribute names of join attributes need not be the same
EqualityCondition is a general boolean expression (logical "and" and/or "or") of terms with equality predicates only
Join attributes from both R and M appear in the output (no Project)
Perhaps the most important and common type of Join!
Lots of R&D on efficient implementations!

81 Equi-Join: Example
$T(J, K, P, Q) = R_1 \bowtie_{K=P} R_2$
R_1(J, K)
J   K
10  x
20  y
30  x
R_2(P, Q)
P  Q
x  4
y  9
x  8
T(J, K, P, Q)
J   K  P  Q
10  x  x  4
20  y  y  9
30  x  x  8
10  x  x  8
30  x  x  4
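The example can be checked with a short sketch. The nested-loop version follows the definition directly. The hash-based version is one common implementation strategy, added here as an illustration and not taken from the slide; the function name `hash_equi_join` is hypothetical.

```python
from collections import defaultdict

# Instances from the slide.
R1 = [(10, "x"), (20, "y"), (30, "x")]   # R1(J, K)
R2 = [("x", 4), ("y", 9), ("x", 8)]      # R2(P, Q)

# Definition: keep every pair of tuples with K = P.
T_nested = [(j, k, p, q) for (j, k) in R1 for (p, q) in R2 if k == p]

def hash_equi_join(r1, r2):
    """Build a hash index on R2.P, then probe it once per R1 tuple."""
    index = defaultdict(list)
    for p, q in r2:
        index[p].append((p, q))
    return [(j, k) + match for (j, k) in r1 for match in index.get(k, [])]

T_hash = hash_equi_join(R1, R2)
print(sorted(T_hash))
```

Both versions return the same five tuples; the hash version avoids comparing every pair, which is what makes equality conditions comparatively cheap to evaluate.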

82 (Primary) Key-Foreign Key Join
A special kind of equi-join
One of the join attributes is the (Primary) Key of an input relation; the other is a Foreign Key in the other relation
Ratings: RatingID, NumStars, Timestamp, UID, MovieID
Users: UserID, Name, Age, JoinDate
$Ratings \bowtie_{UID=UserID} Users$
Also a common and important (sub)type of join with even more specialized efficient implementations!

83 Condition Join (aka Theta Join) Generalization of the Equi-Join R./ JoinCondition M 83

84 Condition Join: Example
$T(J, K, P, Q) = R_1 \bowtie_{J/2 > Q} R_2$
R_1(J, K)
J   K
10  x
20  y
30  x
R_2(P, Q)
P  Q
x  4
y  9
x  12
T(J, K, P, Q)
J   K  P  Q
10  x  x  4
20  y  x  4
20  y  y  9
30  x  x  4
30  x  y  9
30  x  x  12
Perhaps the most difficult type of Join to implement efficiently!
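A small sketch that reproduces this example. The helper name `theta_join` is hypothetical; it accepts an arbitrary predicate, which is why the generic fallback is a nested loop over all pairs.

```python
# Instances from the slide.
R1 = [(10, "x"), (20, "y"), (30, "x")]   # R1(J, K)
R2 = [("x", 4), ("y", 9), ("x", 12)]     # R2(P, Q)

def theta_join(r1, r2, theta):
    """Nested-loop join: keep each concatenated pair that satisfies theta."""
    return [a + b for a in r1 for b in r2 if theta(a, b)]

# Join condition J/2 > Q: a[0] is J, b[1] is Q.
T = theta_join(R1, R2, lambda a, b: a[0] / 2 > b[1])
for row in T:
    print(row)
```

Because the condition is an inequality on a computed value, a hash index on the join attribute does not help here, which hints at why this join type is harder to implement efficiently.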

85 Join Expressions Can compose many joins into a single complex expression Ratings RatingID NumStars Timestamp UserID MovieID UserID UName Age JoinDate Users MovieID Name ReleaseYear Director Movies AllStuff Q. What do we get as the output? UserI D UNa me Ag e JoinDat e RatingI D NumSta rs Timesta mp MovieI D Name ReleaseY Ear Directo r 85

86 Taxonomy of Joins All kinds of joins Outer joins Semi joins Anti joins Inner joins Theta joins Equi joins Key-Foreign Key Joins Snowflake joins Star joins Natural joins 86

87 Derived and Other Relational Ops Set Operation: Intersection Join Group By Aggregate 87

88 Group By Aggregate NOT a part of relational algebra, but Extended RA! Useful for analytics queries that aggregate numerical data Ratings RatingID NumStars Timestamp UserID MovieID What is the average rating for each movie? How many movies has each user rated? Standard 5 numerical aggregations supported in SQL: Count, Sum, Average, Maximum, and Minimum Extra: Median, Mode, Variance, Standard Deviation, etc. 88

89 Group By Aggregate
$\gamma_{X, Agg(Y)}(R)$
X: grouping attributes (a subset of R's attributes)
Y: a numerical attribute in R
Agg: aggregate function (SUM, COUNT, AVG, MAX, MIN)
Output schema will have X and an extra numerical attribute (the result of the aggregate function)
Can list multiple aggregate functions in the same operation

90 Group By Aggregate
R RatingID NumStars Timestamp UserID MovieID /27/ /20/ /02/ /05/
What is the average rating for each movie?
$\gamma_{MovieID, AVG(NumStars)}(R)$
Output schema: MovieID, AVG(NumStars)

91 Group By Aggregate
The set of Grouping Attributes can be empty too!
Users (U) UserID Name Age JoinDate 79 Alice 23 01/10/13 80 Bob 41 05/10/ Carol 19 08/09/ Dan 20 03/01/15
What is the average age of the users?
$\gamma_{AVG(Age)}(U)$
Output schema: AVG(Age)
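The group-by aggregate operator can be sketched in a few lines of Python. The function name `group_by_aggregate` and the Ratings rows are hypothetical; the user ages (23, 41, 19, 20) are the ones on the slide. An empty grouping list puts every tuple into a single group.

```python
from collections import defaultdict

def group_by_aggregate(rows, group_attrs, agg_attr, agg):
    """Partition rows by their values on group_attrs, then apply agg
    to the agg_attr values within each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in group_attrs)
        groups[key].append(row[agg_attr])
    return {key: agg(values) for key, values in groups.items()}

def avg(values):
    return sum(values) / len(values)

# Hypothetical Ratings rows (the slide's values are not recoverable).
ratings = [
    {"MovieID": 20, "NumStars": 4.5},
    {"MovieID": 20, "NumStars": 3.5},
    {"MovieID": 16, "NumStars": 5.0},
]
# Ages from the Users table on the slide.
users = [{"Name": "Alice", "Age": 23}, {"Name": "Bob", "Age": 41},
         {"Name": "Carol", "Age": 19}, {"Name": "Dan", "Age": 20}]

avg_per_movie = group_by_aggregate(ratings, ["MovieID"], "NumStars", avg)
avg_age = group_by_aggregate(users, [], "Age", avg)  # empty grouping list
print(avg_per_movie, avg_age)
```

With the empty grouping list the result has exactly one entry, keyed by the empty tuple, matching the single-row output of the average-age query.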

92 Derived and Other Relational Ops Set Operation: Intersection Join Group By Aggregate 92

93 Recap: SQL (Not covered in class) 93

94 Basic Form of an SQL Query
SELECT [DISTINCT] target-list
FROM relation-list
[WHERE condition]
DISTINCT: optional
target-list: list of attributes to project
relation-list: list of relations (possibly with "aliases")
condition: selection/join condition (optional)

95 What does it mean logically?
SELECT [DISTINCT] target-list
FROM relation-list
[WHERE condition]
1. Cross-product of relations in relation-list
2. If condition given, apply it to filter out tuples
3. Remove attributes not present in target-list
4. If DISTINCT given, deduplicate tuples in result
The above is only a logical interpretation. It is NOT a plan an RDBMS would use in general to run an SQL query!

96 Example SQL Query
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
Example: Get the names of movies released in 2013
SELECT M.Name FROM Movies M WHERE M.Year = 2013
$\pi_{Name}(\sigma_{Year=2013}(M))$

97 Example SQL Query Movies (M) MovieID Name Year Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron 53 Gravity 2013 Alfonso Cuaron 74 Blue Jasmine 2013 Woody Allen SELECT M.Name FROM Movies M WHERE M.Year = 2013 Name Gravity Blue Jasmine 97

98 Example SQL Query Movies (M) MovieID Name Year Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron 53 Gravity 2013 Alfonso Cuaron 74 Blue Jasmine 2013 Woody Allen SELECT * FROM Movies M WHERE M.Year = 2013 MovieID Name Year Director 53 Gravity 2013 Alfonso Cuaron 74 Blue Jasmine 2013 Woody Allen 98

99 Example SQL Query
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
Example: Get the names of movies from years other than 2013
SELECT M.Name FROM Movies M WHERE M.Year <> 2013
$\pi_{Name}(\sigma_{Year \neq 2013}(M))$

100 Example SQL Query
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
Example: For which years do we have movie data?
SELECT M.Year FROM Movies M
$\pi_{Year}(M)$

101 Example SQL Query
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
SELECT M.Year FROM Movies M
Output (Year): 2010, 2009, 2013, 2013
SQL allows repetitions of tuples in a relation!
Not the same semantics as RA's Project
Called bag semantics vs. RA's set semantics

102 DISTINCT in SQL
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
SELECT DISTINCT M.Year FROM Movies M
Output (Year): 2010, 2009, 2013
DISTINCT is needed to achieve the set semantics of RA's Project in SQL
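The difference between bag and set semantics is easy to observe with Python's built-in `sqlite3` module, using the Movies rows from the slide. This is a sketch; the results are sorted before printing because SQL gives no ordering guarantee without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Name TEXT, Year INTEGER, Director TEXT)")
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?)", [
    (20, "Inception", 2010, "Christopher Nolan"),
    (16, "Avatar", 2009, "Jim Cameron"),
    (53, "Gravity", 2013, "Alfonso Cuaron"),
    (74, "Blue Jasmine", 2013, "Woody Allen"),
])

# Bag semantics: 2013 appears once per matching tuple.
bag_years = sorted(r[0] for r in conn.execute("SELECT M.Year FROM Movies M"))
# Set semantics: DISTINCT removes the repetition.
set_years = sorted(r[0] for r in conn.execute("SELECT DISTINCT M.Year FROM Movies M"))

print(bag_years)
print(set_years)
```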

103 Aliases in SQL Movies (M) MovieID Name Year Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron 53 Gravity 2013 Alfonso Cuaron 74 Blue Jasmine 2013 Woody Allen SELECT M.Name FROM Movies M WHERE M.Year = 2013 SELECT Name FROM Movies WHERE Year = 2013 Why bother with the alias? Not needed here! 103

104 Aliases in SQL
Useful for Joins!
Movies (M): MovieID, Name, Year, DirectorID
Directors (D): DID, Name, Age
Example: Get names of movies directed by Jim Cameron
SELECT M.Name FROM Movies M, Directors D WHERE D.Name = 'Jim Cameron' AND M.DirectorID = D.DID
Aliases help disambiguate attributes with the same name from multiple relations (or even a self-join!)

105 More SQL Examples
Movies (M): MovieID, Name, Year, DirectorID
Directors (D): DID, Name, Age
Example: Get names of movies released in 2013 by Woody Allen or some other director 50 years or older
SELECT M.Name FROM Movies M, Directors D WHERE (D.Name = 'Woody Allen' OR D.Age >= 50) AND M.Year = 2013 AND M.DirectorID = D.DID

106 LIKE in SQL
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
Example: Get the directors of movies that start with "Blue"
SELECT DISTINCT M.Director FROM Movies M WHERE M.Name LIKE 'Blue%'
% matches any number of characters; _ matches exactly one character
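A short sketch of the two LIKE wildcards, run against the slide's Movies rows with Python's built-in `sqlite3` module. The second pattern, `'_ravity'`, is a made-up example to show the single-character wildcard. One SQLite-specific caveat: its LIKE is case-insensitive for ASCII letters by default, which other systems may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (MovieID INTEGER, Name TEXT, Year INTEGER, Director TEXT)")
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?)", [
    (20, "Inception", 2010, "Christopher Nolan"),
    (16, "Avatar", 2009, "Jim Cameron"),
    (53, "Gravity", 2013, "Alfonso Cuaron"),
    (74, "Blue Jasmine", 2013, "Woody Allen"),
])

# % matches any number of characters (including zero).
blue_directors = [r[0] for r in conn.execute(
    "SELECT DISTINCT M.Director FROM Movies M WHERE M.Name LIKE 'Blue%'")]

# _ matches exactly one character.
one_char = [r[0] for r in conn.execute(
    "SELECT M.Name FROM Movies M WHERE M.Name LIKE '_ravity'")]

print(blue_directors, one_char)
```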

107 ORDER BY in SQL
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
74       Blue Jasmine  2013  Woody Allen
SELECT M.Name FROM Movies M WHERE M.Year = 2013 ORDER BY M.Name
Output (Name): Blue Jasmine, Gravity
Useful for data readability
Ordering defined by domain semantics
Can specify DESC; multiple attributes

108 LIMIT in SQL MovieID Name Year Director 20 Inception 2010 Christopher Nolan 16 Avatar 2009 Jim Cameron 53 Gravity 2013 Alfonso Cuaron 74 Blue Jasmine 2013 Woody Allen SELECT M.Name FROM Movies M WHERE M.Year >= 2010 ORDER BY M.Year LIMIT 2 Year Inception Gravity Blue Jasmine Also useful for data readability Prevents flooding of screen with data Be wary of using it without ORDER BY! 108

109 Aggregate Functions in SQL
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
91       Interstellar  2014  Christopher Nolan
How many movies came out after 2010?
SELECT COUNT(*) FROM Movies M WHERE M.Year > 2010

110 Aggregate Functions in SQL
Movies (M): same instance as above
How many movies came out after 2010?
SELECT COUNT(*) FROM Movies M WHERE M.Year > 2010
Output (COUNT(*)): 2

111 5 Native Aggregate Functions in SQL COUNT ([DISTINCT] attribute) AVG ([DISTINCT] attribute) SUM ([DISTINCT] attribute) MAX (attribute) MIN (attribute) 111

112 Aggregate Functions in SQL
Movies (M)
MovieID  Name          Year  Director
20       Inception     2010  Christopher Nolan
16       Avatar        2009  Jim Cameron
53       Gravity       2013  Alfonso Cuaron
91       Interstellar  2014  Christopher Nolan
How many directors do we have?
SELECT COUNT(DISTINCT M.Director) FROM Movies M

113 Aggregate Functions in SQL
Ratings (R): RatingID, Stars, UserID, MovieID
Which MovieID(s) have the highest rating?
SELECT R.MovieID, MAX(R.Stars) FROM Ratings R
Other attributes are NOT allowed in the target-list as such!

114 Aggregate Functions in SQL
Ratings (R): RatingID, Stars, UserID, MovieID
Which MovieID(s) have the highest rating?
SELECT DISTINCT R.MovieID FROM Ratings R WHERE R.Stars = (SELECT MAX(R2.Stars) FROM Ratings R2)
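The subquery formulation can be run as a sketch with Python's built-in `sqlite3` module. The Ratings rows are hypothetical; two movies tie for the highest rating so that the "(s)" in "MovieID(s)" is exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Ratings (RatingID INTEGER, Stars REAL, UserID INTEGER, MovieID INTEGER)")
# Hypothetical rows: movies 20 and 16 both have a 5.0 rating.
conn.executemany("INSERT INTO Ratings VALUES (?, ?, ?, ?)", [
    (1, 3.5, 79, 20),
    (2, 5.0, 80, 20),
    (3, 5.0, 79, 16),
    (4, 4.0, 80, 53),
])

query = """
SELECT DISTINCT R.MovieID
FROM Ratings R
WHERE R.Stars = (SELECT MAX(R2.Stars) FROM Ratings R2)
"""
# The inner query computes the single maximum; the outer query finds
# every movie with a rating equal to it.
top_movies = sorted(r[0] for r in conn.execute(query))
print(top_movies)
```

The inner query yields one scalar, so the outer WHERE clause compares each rating against it; DISTINCT keeps a movie from being listed once per top rating.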

115 Group By Aggregate in SQL
$\gamma_{X, Agg(Y)}(R)$
SELECT [DISTINCT] target-list
FROM relation-list
[WHERE condition]
GROUP BY grouping-list
HAVING group-condition
grouping-list: X
target-list must be in this form: X, Agg(Y), where the non-aggregated attributes are a subset of X
group-condition: condition on each group in aggregate

116 Group By Aggregate in SQL
Ratings (R): RatingID, Stars, UserID, MovieID
What is the average rating for each movie?
$\gamma_{MovieID, AVG(Stars)}(R)$
SELECT R.MovieID, AVG(R.Stars) AS AvgRating FROM Ratings R GROUP BY R.MovieID

117 Group By Aggregate in SQL Ratings (R) RatingID Stars UserID MovieID SELECT R.MovieID, AVG(R.Stars) AS AvgRating FROM Ratings R GROUP BY R.MovieID MovieID AvgRating One tuple in output per unique value of R.MovieID (aka group ) 117

118 Advanced SQL Operations (Optional) 118

119 UNION in SQL
Ratings (R): RID, Stars, UID, MID
Movies (M): MID, Name, Year, Director
Get the IDs of users that have rated a movie directed by Ang Lee or a movie that was released in 2013
SELECT R.UID FROM Ratings R, Movies M WHERE R.MID = M.MID AND M.Director = 'Ang Lee'
UNION
SELECT R.UID FROM Ratings R, Movies M WHERE R.MID = M.MID AND M.Year = 2013
The two subqueries are union-compatible!

120 Semantics of UNION in SQL RID Stars UID MID MID Name Year Director 20 Inception 2010 Christopher Nolan 42 Life of Pi 2012 Ang Lee 53 Gravity 2013 Alfonso Cuaron R M UID SELECT R.UID 79 FROM Ratings R, Movies M 123 WHERE R.MID = M.MID AND M.Director = Ang Lee UNION UID SELECT R.UID FROM Ratings R, Movies M 79 WHERE R.MID = M.MID AND M.Year =

121 Semantics of UNION in SQL UID 79 UID 79 UNION UID UNION implicitly deduplicates tuples (unlike SELECT)! Q. How to retain duplicates with UNION? UID 79 UID 79 UNION ALL UID

122 INTERSECT in SQL RID Stars UID MID MID Name Year Director 20 Inception 2010 Christopher Nolan 42 Life of Pi 2012 Ang Lee 53 Gravity 2013 Alfonso Cuaron R M UID SELECT R.UID INTERSECT FROM Ratings R, Movies M 123 WHERE R.MID = M.MID AND M.Director = Ang Lee INTERSECT UID SELECT R.UID FROM Ratings R, Movies M 79 WHERE R.MID = M.MID AND M.Year = UID

123 EXCEPT (Set Difference) in SQL RID Stars UID MID MID Name Year Director 20 Inception 2010 Christopher Nolan 42 Life of Pi 2012 Ang Lee 53 Gravity 2013 Alfonso Cuaron R M UID SELECT R.UID 79 EXCEPT FROM Ratings R, Movies M WHERE R.MID = M.MID AND M.Director = Ang Lee EXCEPT UID SELECT R.UID FROM Ratings R, Movies M 79 WHERE R.MID = M.MID AND M.Year = UID

124 The Contentious Bag Semantics! UID UID UNION ALL UID Add the number of repetitions

125 The Contentious Bag Semantics! UID UID EXCEPT ALL UID Subtract the number of repetitions

126 The Contentious Bag Semantics! UID UID INTERSECT ALL UID Minimum of the number of repetitions
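The three bag rules (add, subtract, and take the minimum of the number of repetitions) correspond directly to arithmetic on Python's `collections.Counter`. A minimal sketch with hypothetical UID bags:

```python
from collections import Counter

# Hypothetical bags of UIDs: each value is the number of repetitions.
a = Counter({79: 2, 123: 1})
b = Counter({79: 1, 80: 1})

union_all = a + b        # UNION ALL: add the repetition counts
except_all = a - b       # EXCEPT ALL: subtract counts, dropping those <= 0
intersect_all = a & b    # INTERSECT ALL: minimum of the counts

# Plain UNION deduplicates, i.e., it falls back to set semantics.
union = set(a) | set(b)

print(union_all, except_all, intersect_all, union)
```

UID 79 appears three times under UNION ALL, once under EXCEPT ALL, once under INTERSECT ALL, and exactly once under plain UNION.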


More information

Introduction to Databases Fall-Winter 2009/10. Syllabus

Introduction to Databases Fall-Winter 2009/10. Syllabus Introduction to Databases Fall-Winter 2009/10 Syllabus Werner Nutt Syllabus Lecturer Werner Nutt, nutt@inf.unibz.it, Room TRA 2.01 Office hours: Thursday, 16:00 18:00 (If you want to meet up with me, send

More information

Relational Query Languages. Preliminaries. Formal Relational Query Languages. Example Schema, with table contents. Relational Algebra

Relational Query Languages. Preliminaries. Formal Relational Query Languages. Example Schema, with table contents. Relational Algebra Note: Slides are posted on the class website, protected by a password written on the board Reading: see class home page www.cs.umb.edu/cs630. Relational Algebra CS430/630 Lecture 2 Relational Query Languages

More information

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy

Lecture 3 SQL - 2. Today s topic. Recap: Lecture 2. Basic SQL Query. Conceptual Evaluation Strategy 9/3/17. Instructor: Sudeepa Roy CompSci 516 Data Intensive Computing Systems Lecture 3 SQL - 2 Instructor: Sudeepa Roy Announcements HW1 reminder: Due on 09/21 (Thurs), 11:55 pm, no late days Project proposal reminder: Due on 09/20 (Wed),

More information
