data systems 101 prof. Stratos Idreos class 2

class 2 data systems 101 prof. Stratos Idreos HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS265/

2 classes per week - OH/Labs every day
1 presentation/discussion lead - 2 reviews each week
research (or systems) project + midway check-in
~15-20 hours per week

systems project research project individual project c/c++ groups of three analysis only open to cs165 students (unless proven otherwise) Stratos Idreos 3 /39

nosql key-value stores are everywhere
project = basic state-of-the-art design

Give me a design given a read/write ratio.
How would new hardware X impact performance?
What hardware is needed to achieve throughput X?
www.crimsondb.com

big data V's (it is not about size only): volume, velocity, variety, veracity
actually none of that is really new
new: our ability to gather and store machine generated data
new: broad understanding that we cannot just manually get value out of data

a data system stores data and provides access to data & makes knowledge generation easy
(diagram labels: data system, data analysis, knowledge)

declarative interface: ask what you want
data system: the system decides how to best store and access data

applications
sql
database kernel: algorithms/operators
cpu, memory, disk (data)

SQL: complex, legacy, tuning, expensive
nosql: simple, clean, just enough
as apps become more complex, as apps need to be more scalable: newsql
Gartner: DBMS Market = ~36 Billion; nosql/hadoop = ~700 Million; SQL = ~rest
goal: understand why, how and what is next

more applications, more data, more h/w: continuous need for new systems

new applications/requirements

1) soon everyone will need to be a data scientist
"hmm, my data is too big :("
how far away are we from a future where a data system sits in the critical path of everything we do?

2) data exploration
not always sure what we are looking for (until we find it)

3 daily data data* daily skills data years [IBMbigdata] years [StratosGuess] data system design, set-up, tune, use Stratos Idreos 14/39

system where the db runs
sql
cpu - cpu - cpu - cpu
memory hierarchy (smaller/faster toward the top): cpu registers, caches, memory, disk
disk - disk - disk - disk + flash + non volatile memory

level            relative cost   where the data is   time to get it
registers                        my head             ~0
on chip cache    2x              this room           1 min
on board cache   10x             this building       10 min
memory           100x            New York            1.5 hours
disk             100Kx           Pluto               2 years

analogy by Jim Gray (IBM, Tandem, DEC, Microsoft; ACM Turing award; ACM SIGMOD Edgar F. Codd Innovations award)

memory hierarchy: CPU registers, on chip cache, on board cache, memory, disk
faster toward the registers, cheaper toward the disk
caches are SRAM, memory is DRAM
access times shown on the slide: ~1ns, ~10ns, ~100ns
cache miss: looking for something which is not in the cache
memory miss: looking for something which is not in memory
memory wall (plot: cpu speed vs mem speed over time)

design of storage/access methods/algorithms should minimize: data misses + instruction misses
and touch/access only what you need

random access & page-based access
need to only read data value x, but have to read all of page 1
(diagram: data moves page by page, page1, page2, page3, from disk through memory, on board cache and on chip cache to the registers and the CPU)

query x<5 with a scan (size=120 bytes)
page size: 5x8 bytes = 40 bytes
memory level N-1 holds three pages:
page 1: 5 10 6 4 12
page 2: 2 8 9 7 6
page 3: 7 11 3 9 6

step 1: move page 1 to memory level N and scan it; result so far: 4; data moved: 40 bytes
step 2: move page 2 to memory level N and scan it; result so far: 4 2; data moved: 80 bytes
step 3: move page 3 to memory level N and scan it; result: 4 2 3; data moved: 120 bytes
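The scan walkthrough above can be sketched in C. This is a minimal illustration, not the course's implementation: the function name `paged_scan_lt`, the constant `PAGE_VALUES`, and the use of `memcpy` to stand in for moving a page from level N-1 to level N are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_VALUES 5   /* values per page, as on the slide (5 x 8 bytes) */

/* Scan n 8-byte values that live at the slower level, one page at a time.
 * Each page is copied into a local buffer (the faster level) before its
 * values are tested against value < bound. Qualifying values are appended
 * to out, which must have room for n values. Returns the number of
 * qualifying values and stores the bytes moved between levels in
 * *bytes_moved. */
size_t paged_scan_lt(const int64_t *slow, size_t n, int64_t bound,
                     int64_t *out, size_t *bytes_moved)
{
    int64_t page[PAGE_VALUES];
    size_t found = 0;

    assert(bytes_moved != NULL);
    *bytes_moved = 0;

    for (size_t start = 0; start < n; start += PAGE_VALUES) {
        size_t count = n - start < PAGE_VALUES ? n - start : PAGE_VALUES;

        /* "move" one page to the faster level */
        memcpy(page, slow + start, count * sizeof page[0]);
        *bytes_moved += count * sizeof page[0];

        /* consume the page completely */
        for (size_t i = 0; i < count; i++)
            if (page[i] < bound)
                out[found++] = page[i];
    }
    return found;
}
```

Called on the 15 values from the slide with bound 5, this returns 3 qualifying values (4, 2, 3) and reports 120 bytes moved, matching the three 40-byte steps.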

an oracle gives us the positions
query x<5 (size=120 bytes)
page size: 5x8 bytes = 40 bytes
memory level N-1 holds three pages:
page 1: 5 10 6 4 12
page 2: 2 8 9 7 6
page 3: 7 11 3 9 6

step 1: the oracle points into page 1; move page 1 to memory level N; result so far: 4; data moved: 40 bytes
step 2: the oracle points into page 2; move page 2 to memory level N; result so far: 4 2; data moved: 80 bytes
step 3: the oracle points into page 3; move page 3 to memory level N; result: 4 2 3; data moved: 120 bytes

when does it make sense to have an oracle?
how can we minimize the cost?
e.g., query x<5 over 5 10 6 4 12 2 8 9 7 6 7 11 3 9 6
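One way to reason about the oracle question is to count how many distinct pages the oracle's positions fall into, since a whole page moves even when only one value on it is needed. The sketch below assumes zero-based positions sorted in ascending order; the function name `bytes_moved_with_oracle` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Given the positions an oracle hands us (zero-based, sorted ascending),
 * return how many bytes must move when data travels in whole pages of
 * values_per_page values, each value_bytes wide. */
size_t bytes_moved_with_oracle(const size_t *positions, size_t k,
                               size_t values_per_page, size_t value_bytes)
{
    size_t pages = 0;
    size_t last_page = 0;

    assert(values_per_page > 0);

    for (size_t i = 0; i < k; i++) {
        size_t page = positions[i] / values_per_page;
        /* count a page only the first time a position lands on it */
        if (i == 0 || page != last_page) {
            pages++;
            last_page = page;
        }
    }
    return pages * values_per_page * value_bytes;
}
```

For the slide's data the qualifying values sit at positions 3, 5 and 12, one per page, so the oracle still moves 120 bytes, the same as the scan. If the three qualifying values sat next to each other at positions 0, 1 and 2, only one page (40 bytes) would move.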

design space
it all starts with how we store data
every bit matters

sequential access: read one block; consume it completely; discard it; read next
in parallel/prefetching: what is next? 1 2 3 4
hardware can better predict/buffer sequential pages to be read
e.g., 2MB buffers in modern DRAM
amortize cost of moving disk arms

random access: read one block; consume it partially; discard it; might have to read it again in future; read random next

C/C++
no libraries unless we explicitly allow it
we expect you to build everything from scratch so you can control storage and access 100%

MAIN-MEMORY OPTIMIZED DATA SYSTEMS

a simple example
assume an array of N integers: find all positions where value>x
(diagram: data, qualifying positions)
select operator exists in all systems: sql, nosql, newsql

assume an array of N integers: find all positions where value>x

    res = new array[data.size]
    j = 0
    for (i = 0; i < data.size; i++)
        if data[i] > x
            res[j++] = i

what if only 1% qualifies?
in memory, res is allocated as large as data, but only a small part of it is filled; the qualifying positions can then be copied into a smaller res
but how can we know?
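A runnable C version of the select loop, as a sketch only. It allocates for the worst case (everything qualifies) and shrinks the result afterwards with `realloc`, which may copy; this is one possible answer to the "1% qualifies" question, not necessarily the one intended on the slide. The name `select_gt` and the out-parameter are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return a heap-allocated array with the positions i where data[i] > x and
 * store how many there are in *out_count. The caller frees the result.
 * Returns NULL when nothing qualifies or when allocation fails. */
size_t *select_gt(const int *data, size_t n, int x, size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = 0;
    if (n == 0)
        return NULL;

    /* worst case: every value qualifies, so res is as large as data */
    size_t *res = malloc(n * sizeof *res);
    if (res == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > x)
            res[j++] = i;

    *out_count = j;
    if (j == 0) {
        free(res);
        return NULL;
    }

    /* shrink to the part actually used; realloc may copy the positions */
    size_t *shrunk = realloc(res, j * sizeof *res);
    return shrunk != NULL ? shrunk : res;
}
```

The over-allocation is the cost the slide points at: with 1% selectivity, 99% of the first allocation is never written, and the exact result size is only known after the loop has run.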

assume an array of N integers: find all positions where value>x
(same select loop: for each i, if data[i]>x then res[j++]=i)

what if 90% qualifies?
result size = qualifying values * x bytes
bit vector for res? 1 0 0 0 1 1 1 0 1 vs 00000001 00000101 00000110 00000111 00001001
if statements = bad, bad, bad
and we haven't even started discussing how to find the qualifying values
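The two ideas on this slide, a bit vector result and avoiding the if statement, can each be sketched in a few lines of C. Both functions are illustrative assumptions (names, signatures, and the least-significant-bit-first layout are not from the slides). The branch-free variant uses predication: it always writes the position and advances the output index only when the value qualifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit vector result: bit i is set when data[i] > x.
 * bits must have room for (n + 7) / 8 bytes; bit i lives in byte i / 8,
 * least significant bit first (a layout chosen for this sketch). */
void select_gt_bitvector(const int *data, size_t n, int x, uint8_t *bits)
{
    assert(bits != NULL);
    memset(bits, 0, (n + 7) / 8);
    for (size_t i = 0; i < n; i++)
        bits[i / 8] |= (uint8_t)((data[i] > x) << (i % 8));
}

/* Position list without an if statement in the loop body.
 * res must have room for n positions, because a slot is written on every
 * iteration even when the value does not qualify. */
size_t select_gt_branchfree(const int *data, size_t n, int x, size_t *res)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        res[j] = i;            /* always write the candidate position */
        j += (data[i] > x);    /* keep it only if the value qualifies */
    }
    return j;
}
```

For nine values where the 1st, 5th, 6th, 7th and 9th qualify, as in the slide's 1 0 0 0 1 1 1 0 1 example, the bit vector needs 2 bytes, while the position list stores five full-width positions.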

assume an array of N integers: find all positions where value>x
(same select loop: for each i, if data[i]>x then res[j++]=i)

logically partition the data: thread1 on core1, thread2 on core2, thread3 on core3
NUMA architectures?
SIMD functionality?
& what about result writing?
not as simple as spinning off N threads
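The "what about result writing?" question can be illustrated with a two-pass scheme: count the qualifying values per partition, turn the counts into write offsets with a prefix sum, then let each partition write into its own disjoint slice of the result. The sketch below is an assumption about one workable approach, not the method from the lecture, and it processes the partitions in a plain loop instead of on real threads so that it stays self-contained. `MAX_PARTS`, `part_begin` and `partitioned_select_gt` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTS 64   /* upper bound on partitions, chosen for this sketch */

/* First index of partition p when n values are split into chunks. */
static size_t part_begin(size_t p, size_t chunk, size_t n)
{
    size_t b = p * chunk;
    return b < n ? b : n;
}

/* Two-pass partitioned select: res must have room for n positions.
 * Each loop over p could be handed to one thread per partition, since
 * partitions read disjoint input ranges and write disjoint output ranges. */
size_t partitioned_select_gt(const int *data, size_t n, int x,
                             size_t parts, size_t *res)
{
    size_t offset[MAX_PARTS + 1];

    assert(parts > 0 && parts <= MAX_PARTS);
    size_t chunk = (n + parts - 1) / parts;

    /* pass 1: count qualifying values per partition, build prefix sum */
    offset[0] = 0;
    for (size_t p = 0; p < parts; p++) {
        size_t count = 0;
        for (size_t i = part_begin(p, chunk, n);
             i < part_begin(p + 1, chunk, n); i++)
            count += (data[i] > x);
        offset[p + 1] = offset[p] + count;
    }

    /* pass 2: each partition writes starting at its own offset */
    for (size_t p = 0; p < parts; p++) {
        size_t j = offset[p];
        for (size_t i = part_begin(p, chunk, n);
             i < part_begin(p + 1, chunk, n); i++)
            if (data[i] > x)
                res[j++] = i;
    }
    return offset[parts];
}
```

The price is reading the data twice. The alternative of sharing one output index `j` across threads would need synchronization on every write, which is part of why this is not as simple as spinning off N threads.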

assume an array of N integers: find all positions where value>x
(same select loop: for each i, if data[i]>x then res[j++]=i)

N>>1 queries in parallel
(diagram: q1, q2, q3 run over the data together; q4 arrives later)
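One possible reading of the q1, q2, q3 diagram is a shared scan, where a single pass over the data serves several queries at once. That reading is an assumption, and so is everything in the sketch below: the name `shared_scan_count_gt`, the choice to count qualifying values instead of materializing positions, and the fact that a late query such as q4 is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate q queries of the form value > bounds[k] in one pass over data.
 * counts[k] receives the number of qualifying values for query k.
 * Each value is loaded once and tested against every query, so the data
 * moves through the memory hierarchy once instead of q times. */
void shared_scan_count_gt(const int *data, size_t n,
                          const int *bounds, size_t q, size_t *counts)
{
    assert(counts != NULL || q == 0);

    for (size_t k = 0; k < q; k++)
        counts[k] = 0;

    for (size_t i = 0; i < n; i++) {
        int v = data[i];
        for (size_t k = 0; k < q; k++)
            counts[k] += (v > bounds[k]);
    }
}
```

The trade-off is more computation per value in exchange for less data movement per query.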

assume an array of N integers: find all positions where value>x

    res = new array[10]
    j = 0
    for (i = 0; i < 10; i++)
        if data[i] > x
            res[j++] = i

vs

    res = new array[10]
    j = 0
    if data[0] > x res[j++] = 0
    if data[1] > x res[j++] = 1
    if data[2] > x res[j++] = 2
    if data[3] > x res[j++] = 3
    if data[4] > x res[j++] = 4
    if data[5] > x res[j++] = 5
    if data[6] > x res[j++] = 6
    if data[7] > x res[j++] = 7
    if data[8] > x res[j++] = 8
    if data[9] > x res[j++] = 9
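The loop versus unrolled comparison generalizes to arrays of any size by unrolling a fixed number of iterations and handling the leftovers in a tail loop. The sketch below unrolls by four; the factor and the name `select_gt_unrolled4` are assumptions for illustration, and a modern compiler may apply a similar transformation on its own at higher optimization levels.

```c
#include <assert.h>
#include <stddef.h>

/* Select positions where data[i] > x, with the loop body unrolled 4 times.
 * res must have room for n positions. Returns the number of positions. */
size_t select_gt_unrolled4(const int *data, size_t n, int x, size_t *res)
{
    size_t i = 0;
    size_t j = 0;

    assert(res != NULL || n == 0);

    /* main loop: one loop-condition check per four values */
    for (; i + 4 <= n; i += 4) {
        if (data[i]     > x) res[j++] = i;
        if (data[i + 1] > x) res[j++] = i + 1;
        if (data[i + 2] > x) res[j++] = i + 2;
        if (data[i + 3] > x) res[j++] = i + 3;
    }

    /* tail loop: the last n % 4 values */
    for (; i < n; i++)
        if (data[i] > x)
            res[j++] = i;

    return j;
}
```

Unrolling removes loop-control work (the increment and the bound check) but leaves the data-dependent if statements in place.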

assume an array of N integers: find all positions where value>x
option 1: scan all data
option 2: use a tree (do not consider tree generation costs)
which one is best?

cost: data touched & computation
(plot: cpu speed vs mem speed over time)

design: logical design, physical design, system design

how can I prepare?

1) start browsing some basic texts

Get familiar with the very basics of traditional database architectures:
Architecture of a Database System. By J. Hellerstein, M. Stonebraker and J. Hamilton. Foundations and Trends in Databases, 2007

Get familiar with the very basics of modern database architectures:
The Design and Implementation of Modern Column-store Database Systems. By D. Abadi, P. Boncz, S. Harizopoulos, S. Idreos, S. Madden. Foundations and Trends in Databases, 2013

Get familiar with the very basics of modern large scale systems:
Massively Parallel Databases and MapReduce Systems. By Shivnath Babu and Herodotos Herodotou. Foundations and Trends in Databases, 2013

2) play with basic data structures implementation in C (linked list/hash table/tree)

Action steps:
1) Read the syllabus & website carefully
2) Register to Piazza
3) Do P0 if you have not taken CS165 and check self-test
4) Register for paper presentation (week 2)
5) Start submitting your paper reviews (week 3)

web site: http://daslab.seas.harvard.edu/classes/cs265/
piazza: piazza.com/harvard/spring2018/cs265/home
office hours: Stratos: Wed/Thur/Fri, 3-4pm, MD139
TF office hours: check website
textbook: nope
research papers will be available from the Harvard network

class 2 data systems 101 BIG DATA SYSTEMS prof. Stratos Idreos next week modern main-memory optimized data systems & semester projects