How They Work. Larry Page (one of the founders of Google) wrote the following in July 1996:


1 Databases: How They Work

Back in 1996

Larry Page (one of the founders of Google) wrote the following in July 1996:

"I am almost out of disk space. I have downloaded about 24 million unique URLs and about 100 million links. I think I will need 8 gigs more to store everything. Current retail prices are about $1000/4 gigs. I have only about 15% of the pages but it seems promising."

2 Why Databases?

Now (2006) there are tens of billions (1 billion = 10^9) of documents on the Internet. How can anyone find a needle in this haystack?

Answer: Pointers to documents are stored in databases in the form of keywords, so that specific documents can be found and retrieved by using a combination of search terms.

Data Storage Background

Suppose there are 20 billion (= 20x10^9) documents and the first 100 characters of each are stored using 1 byte (= 8 bits) per character. Then 20x10^9 x 100 = 2x10^12 bytes, or 2 trillion (= 2000 billion) bytes, of storage are needed. A 200 GB (gigabyte) hard drive costs about $200, so 10 drives are needed for a total of $2000, which is quite reasonable.
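The storage estimate above can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic; the variable names are illustrative, and decimal gigabytes (1 GB = 10^9 bytes) are assumed, as on the slide.

```python
# Back-of-the-envelope storage estimate using the figures from the slide.
documents = 20 * 10**9          # 20 billion documents
bytes_per_document = 100        # first 100 characters, 1 byte per character
total_bytes = documents * bytes_per_document

drive_bytes = 200 * 10**9       # one 200 GB drive (decimal gigabytes assumed)
drive_cost_usd = 200            # approximate price quoted on the slide

# Ceiling division: a partly filled drive still has to be bought.
drives_needed = -(-total_bytes // drive_bytes)
total_cost_usd = drives_needed * drive_cost_usd

print(f"{total_bytes:.0e} bytes, {drives_needed} drives, ${total_cost_usd}")
```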

3 Reading Hard Drive

Data is written in the form of concentric tracks onto the platters in the hard drive. Each track can hold kbytes (kilobytes) and the platters rotate at approx. 7200 rpm (= 7200/60 = 120 rev/sec). Assuming 167 kbytes/track, the read rate is 120 x 167x10^3 = 20 MB/s (megabytes/sec), rounded. But 200 GB / 20 MB/s = 2x10^11 / 2x10^7 = 10^4 sec. Note that 10^4 sec = 2.78 hours for 200 GB, so reading all 10 drives one after another would take more than a day. Direct search/read from hard drives is not feasible!

Database Organization

Desirable properties of a database:
Answers queries fast, with options to narrow or broaden search results.
Easy to update and maintain without introducing inconsistencies.
Flexible and easy to expand as data and users evolve.
Scalable as size increases (by orders of magnitude, e.g., from millions to billions).
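The read-time estimate on this slide can be checked the same way. The sketch below uses the slide's assumed 167 kbytes per track and 7200 rpm; the variable names are illustrative, and the rate is rounded to 20 MB/s as on the slide before computing the scan time.

```python
# Sequential read time for one 200 GB drive, following the slide's numbers.
revs_per_sec = 7200 / 60                       # 120 revolutions per second
bytes_per_track = 167 * 10**3                  # assumed 167 kbytes per track
read_rate_exact = revs_per_sec * bytes_per_track   # 20.04 million bytes/sec

read_rate = 20 * 10**6                         # rounded to 20 MB/s, as on the slide
drive_bytes = 200 * 10**9                      # 200 GB
seconds = drive_bytes / read_rate              # 10^4 seconds
hours = seconds / 3600

print(f"{read_rate_exact / 1e6:.2f} MB/s, {seconds:.0f} s, {hours:.2f} h")
```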

4 Card File of Mini Zoo

[Figure: seven index cards, one per animal, each with the fields Name, Habitat, Colors, Speed, and Caretaker. The caretaker schedules shown are (6-8 am, 3-6 pm), (10 am - 4 pm), and (6-7 am, 3-4 pm); the colors shown include Red and Yellow.]

Suppose a caretaker changes his schedule. Then we need to go through all cards and enter the change.

Database Queries

We could ask some of the following questions:
How many animals are fast?
Which animals live in water and are grey?
How many animals have a given caretaker?
Which animals are either green or fast?

5 Card File --> Flat File

Suppose we convert the card file to an electronic database on a computer. A straightforward solution is to enter each card into a table that has as many columns as there are features to record. For example, since some animals have 3 colors, we need 3 columns, labeled color1, color2, and color3.

Flat File Database

[Table: one row per card, with the columns #, Name, Habitat1, Habitat2, Color1, Color2, Color3, Speed, and Caretaker.]

Fixed number of columns. Wastes space if most animals have one or two colors. Suppose an exotic bird has four colors, or some animals have two caretakers that take turns. This means that the whole database needs to be enlarged.

6 Flat File Database

Now we can search electronically, but we still need to look at every record to find, for example, which animals are black. Also, if suddenly we have two caretakers that take turns in caring for some of the animals, we need to add a column to each record in the database.

Hierarchical Database

It would be nice to be able to find a record quickly by successively refining the search query. For example, we could ask for all animals that live on land. And then we could ask for all animals that live on land and are fast.
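The cost of searching a flat file can be illustrated with a small sketch. The rows below are hypothetical (the slide's actual table values are not reproduced here), and `animals_with_color` is an illustrative helper, not part of any library. The point is that every record, and every color column in it, has to be inspected.

```python
# Hypothetical flat-file rows: one dict per card, with a fixed set of color columns.
flat_file = [
    {"name": "fish", "habitat1": "water", "habitat2": None,
     "color1": "grey", "color2": None, "color3": None, "speed": "fast"},
    {"name": "duck", "habitat1": "water", "habitat2": "land",
     "color1": "green", "color2": "black", "color3": "white", "speed": "slow"},
    {"name": "cat", "habitat1": "land", "habitat2": None,
     "color1": "black", "color2": None, "color3": None, "speed": "fast"},
]


def animals_with_color(rows, color):
    """Linear scan: look at every record and at all three color columns."""
    matches = []
    for row in rows:
        if color in (row["color1"], row["color2"], row["color3"]):
            matches.append(row["name"])
    return matches


black_animals = animals_with_color(flat_file, "black")
print(black_animals)
```

An animal with a fourth color would not fit this layout without adding a color4 column to every row, which is the enlargement problem described above.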

7 Hierarchical Database

[Figure: hierarchical database diagram (DBMS); labels include the schedules 6-7 am / 3-4 pm and 6-8 am / 3-6 pm and the colors Yellow and Red.]

Hierarchical Database

But there is still a lot of redundancy. For example, duck is listed twice. What if we just ask "How many animals are fast?"
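A nested dictionary gives a rough picture of both the strength and the weakness of the hierarchical layout. The tree below is hypothetical: the habitat-then-speed ordering and all animals other than duck are assumptions made for illustration. Refining along the tree is a direct lookup, but a question that ignores the top level has to visit every branch and remove duplicates.

```python
# Hypothetical hierarchy: habitat at the top level, then speed, then animals.
hierarchy = {
    "land": {"fast": ["horse"], "slow": ["duck"]},
    "water": {"fast": ["fish"], "slow": ["duck"]},
}

# Successive refinement follows the tree: land, then land AND fast.
land_and_fast = hierarchy["land"]["fast"]


def animals_with_speed(tree, speed):
    """Visit every habitat branch; a set removes animals that are listed twice."""
    found = set()
    for by_speed in tree.values():
        found.update(by_speed.get(speed, []))
    return found


fast_count = len(animals_with_speed(hierarchy, "fast"))
slow_listings = sum(len(branch.get("slow", [])) for branch in hierarchy.values())
slow_count = len(animals_with_speed(hierarchy, "slow"))

print(land_and_fast, fast_count, slow_listings, slow_count)
```

Here duck is stored twice (two listings) but counts as one slow animal, which is the redundancy the slide points out.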

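8 Network Database

[Figure: network database diagram; labels include the schedules (6-7, 3-4), (10-4), and (6-8, 3-6) and the colors Red and Yellow.]

Network Database

Has very little redundancy. We can easily ask "How many animals does a given caretaker take care of?" But the structure gets very complex quickly.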

9 Relational Database

Speed Table: Name, Speed
Caretaker Table: Name, Caretaker
Habitat Table: Name, Habitat
Color Table: Name, Color
Caretaker Schedule Table: Caretaker, Schedule

[Table values shown include the colors Red and Yellow and the schedules 6-7 am / 3-4 pm, 10 am - 4 pm, and 6-8 am / 3-6 pm.]

Relational Database

All information in the database is represented as values in tables. Each property (e.g., color or caretaker) appears once and only once as a column in a table. Each datum in the database is accessible by a combination of table name, column name, and primary key (e.g., fish, duck) value.
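The relational layout above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the slide's actual data: the rows, the caretaker name "Alex", and the lowercase table names are assumptions, and only four of the five tables are created. It shows a query that combines two tables and a schedule change that touches a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One table per property, keyed by animal name (or by caretaker for schedules).
cur.executescript("""
CREATE TABLE habitat (name TEXT, habitat TEXT);
CREATE TABLE color (name TEXT, color TEXT);
CREATE TABLE caretaker (name TEXT PRIMARY KEY, caretaker TEXT);
CREATE TABLE caretaker_schedule (caretaker TEXT PRIMARY KEY, schedule TEXT);
""")

# Hypothetical rows; an animal with two habitats or colors simply gets two rows.
cur.executemany("INSERT INTO habitat VALUES (?, ?)",
                [("fish", "water"), ("duck", "water"), ("duck", "land")])
cur.executemany("INSERT INTO color VALUES (?, ?)",
                [("fish", "grey"), ("duck", "green"), ("duck", "black")])
cur.executemany("INSERT INTO caretaker VALUES (?, ?)",
                [("fish", "Alex"), ("duck", "Alex")])
cur.execute("INSERT INTO caretaker_schedule VALUES (?, ?)",
            ("Alex", "6-8 am, 3-6 pm"))

# "Which animals live in water and are grey?" joins the two property tables.
water_and_grey = [row[0] for row in cur.execute("""
    SELECT h.name
    FROM habitat AS h JOIN color AS c ON c.name = h.name
    WHERE h.habitat = 'water' AND c.color = 'grey'
    ORDER BY h.name
""")]

# A schedule change is one UPDATE, however many animals the caretaker looks after.
cur.execute("UPDATE caretaker_schedule SET schedule = ? WHERE caretaker = ?",
            ("10 am - 4 pm", "Alex"))
duck_schedule = cur.execute("""
    SELECT s.schedule
    FROM caretaker AS k JOIN caretaker_schedule AS s ON s.caretaker = k.caretaker
    WHERE k.name = 'duck'
""").fetchone()[0]

conn.close()
print(water_and_grey, duck_schedule)
```

Compare this with the card file, where the same schedule change meant editing every card that mentions the caretaker.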

10 After Sorting wrt Column 2

[Tables: the Speed, Caretaker, Habitat, Color, and Caretaker Schedule tables, each sorted by its second column.]

Sorting a table by its second column groups the animals that share a property value into a set. Such sets can then be combined with AND and OR; an AND of two sets with no animal in common yields the empty set {}.

Sets, Boolean Operators

[Figure: Venn diagrams of Set A, Set B, A AND B, and A OR B.]

Note: A OR B = A + B - (A AND B)
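Python's built-in set type maps directly onto these operators, with `&` for AND and `|` for OR. The sets below are hypothetical examples; the last lines check the counting rule from the note, read as a statement about the number of elements in each set.

```python
# Hypothetical property sets, as they might be read off the sorted tables.
fast = {"fish", "horse"}
water = {"fish", "duck"}
green = {"frog"}

fast_and_water = fast & water       # animals in both sets
fast_or_water = fast | water        # animals in at least one set
fast_and_green = fast & green       # no animal in common: the empty set

# Counting rule: |A OR B| = |A| + |B| - |A AND B|
lhs = len(fast_or_water)
rhs = len(fast) + len(water) - len(fast_and_water)

print(sorted(fast_and_water), sorted(fast_or_water), fast_and_green, lhs, rhs)
```

Subtracting the AND term corrects for the animals that would otherwise be counted twice, once in each set.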

11 Sets, Boolean Operators

[Figure: Venn diagrams of Set A and NOT A.]

A AND NOT A = empty set (nothing)
A OR NOT A = everything (whole database)

Truth Tables

0: false, 1: true
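The truth tables themselves can be generated rather than memorized. The sketch below uses the slide's convention of 0 for false and 1 for true; `truth_table` is an illustrative helper name, and the small "whole database" set is a hypothetical example used to check the two identities for NOT.

```python
from itertools import product


def truth_table():
    """Rows of (A, B, A AND B, A OR B, NOT A), with 0 = false and 1 = true."""
    rows = []
    for a, b in product((0, 1), repeat=2):
        rows.append((a, b, a & b, a | b, 1 - a))
    return rows


print("A B AND OR NOT_A")
for row in truth_table():
    print(*row)

# The same identities on sets, with a hypothetical whole database.
everything = {"fish", "duck", "horse"}
set_a = {"fish"}
not_a = everything - set_a          # NOT A is taken relative to the whole database

nothing = set_a & not_a             # A AND NOT A
whole = set_a | not_a               # A OR NOT A
```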


More information

Admin. ! Assignment 3. ! due Monday at 11:59pm! one small error in 5b (fast division) that s been fixed. ! Midterm next Thursday in-class (10/1)

Admin. ! Assignment 3. ! due Monday at 11:59pm! one small error in 5b (fast division) that s been fixed. ! Midterm next Thursday in-class (10/1) Admin CS4B MACHINE David Kauchak CS 5 Fall 5! Assignment 3! due Monday at :59pm! one small error in 5b (fast division) that s been fixed! Midterm next Thursday in-class (/)! Comprehensive! Closed books,

More information

Lecture 23. Finish-up buses Storage

Lecture 23. Finish-up buses Storage Lecture 23 Finish-up buses Storage 1 Example Bus Problems, cont. 2) Assume the following system: A CPU and memory share a 32-bit bus running at 100MHz. The memory needs 50ns to access a 64-bit value from

More information

Some Basic Terminology

Some Basic Terminology Some Basic Terminology A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Here are a few terms you'll run into: A Application Files Program files environment where you can create and edit the kind of

More information

CS2630: Computer Organization Homework 1 Bits, bytes, and memory organization Due January 25, 2017, 11:59pm

CS2630: Computer Organization Homework 1 Bits, bytes, and memory organization Due January 25, 2017, 11:59pm CS2630: Computer Organization Homework 1 Bits, bytes, and memory organization Due January 25, 2017, 11:59pm Instructions: Show your work. Correct answers with no work will not receive full credit. Whether

More information

Enabling the Smart Grid through Big Data

Enabling the Smart Grid through Big Data Enabling the Smart Grid through Big Data Paul A. Navrá;l, Ph.D. Manager Scalable Visualiza;on Technologies Texas Advanced Compu;ng Center TACC Booth @ SC12 November 14, 2012 The Age of Big Data Records

More information

CS5112: Algorithms and Data Structures for Applications

CS5112: Algorithms and Data Structures for Applications CS5112: Algorithms and Data Structures for Applications Lecture 4.1: Applications of hashing Ramin Zabih Some figures from Wikipedia/Google image search Administrivia Web site is: https://github.com/cornelltech/cs5112-f18

More information

CAS CS 460/660 Introduction to Database Systems. Fall

CAS CS 460/660 Introduction to Database Systems. Fall CAS CS 460/660 Introduction to Database Systems Fall 2017 1.1 About the course Administrivia Instructor: George Kollios, gkollios@cs.bu.edu MCS 283, Mon 2:30-4:00 PM and Tue 1:00-2:30 PM Teaching Fellows:

More information

Efficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488)

Efficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488) Efficiency Efficiency: Indexing (COSC 488) Nazli Goharian nazli@cs.georgetown.edu Difficult to analyze sequential IR algorithms: data and query dependency (query selectivity). O(q(cf max )) -- high estimate-

More information

There are some standard operations on arrays that are used a lot. But Java lacks much support for doing these easily.

There are some standard operations on arrays that are used a lot. But Java lacks much support for doing these easily. New Section 4 Page 1 Operations on Arrays 2:51 PM There are some standard operations on arrays that are used a lot. But Java lacks much support for doing these easily. This set of slides provides some

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Information Retrieval and Organisation Dell Zhang Birkbeck, University of London 2016/17 IR Chapter 01 Boolean Retrieval Example IR Problem Let s look at a simple IR problem Suppose you own a copy of Shakespeare

More information

Storing Data: Disks and Files

Storing Data: Disks and Files Storing Data: Disks and Files Yea, from the table of my memory I ll wipe away all trivial fond records. -- Shakespeare, Hamlet Data Access Disks and Files DBMS stores information on ( hard ) disks. This

More information

Persistent Storage - Datastructures and Algorithms

Persistent Storage - Datastructures and Algorithms Persistent Storage - Datastructures and Algorithms 1 / 21 L 03: Virtual Memory and Caches 2 / 21 Questions How to access data, when sequential access is too slow? Direct access (random access) file, how

More information

Animations involving numbers

Animations involving numbers 136 Chapter 8 Animations involving numbers 8.1 Model and view The examples of Chapter 6 all compute the next picture in the animation from the previous picture. This turns out to be a rather restrictive

More information

Index Construction Introduction to Information Retrieval INF 141 Donald J. Patterson

Index Construction Introduction to Information Retrieval INF 141 Donald J. Patterson Index Construction Introduction to Information Retrieval INF 141 Donald J. Patterson Content adapted from Hinrich Schütze http://www.informationretrieval.org Index Construction Overview Introduction Hardware

More information

CSC 101: Lab Manual#9 Machine Language and the CPU (largely based on the work of Prof. William Turkett) Lab due date: 5:00pm, day after lab session

CSC 101: Lab Manual#9 Machine Language and the CPU (largely based on the work of Prof. William Turkett) Lab due date: 5:00pm, day after lab session CSC 101: Lab Manual#9 Machine Language and the CPU (largely based on the work of Prof. William Turkett) Lab due date: 5:00pm, day after lab session Purpose: The purpose of this lab is to gain additional

More information

Announcements. Reading Material. Recap. Today 9/17/17. Storage (contd. from Lecture 6)

Announcements. Reading Material. Recap. Today 9/17/17. Storage (contd. from Lecture 6) CompSci 16 Intensive Computing Systems Lecture 7 Storage and Index Instructor: Sudeepa Roy Announcements HW1 deadline this week: Due on 09/21 (Thurs), 11: pm, no late days Project proposal deadline: Preliminary

More information

Note. Some History 8/8/2011. TECH 6 Approaches in Network Monitoring ip/f: A Novel Architecture for Programmable Network Visibility

Note. Some History 8/8/2011. TECH 6 Approaches in Network Monitoring ip/f: A Novel Architecture for Programmable Network Visibility TECH 6 Approaches in Network Monitoring ip/f: A Novel Architecture for Programmable Network Visibility Steve McCanne - CTO riverbed Note This presentation is for information purposes only and is not a

More information

CASE STUDY INSURANCE. Innovation at Moody's Analytics: A new approach to database provisioning using SQL Clone

CASE STUDY INSURANCE. Innovation at Moody's Analytics: A new approach to database provisioning using SQL Clone CASE STUDY INSURANCE Innovation at Moody's Analytics: A new approach to database provisioning using SQL Clone We already had a one-click process for database provisioning, but it was still taking too much

More information

Admin CS41B MACHINE. Midterm topics. Admin 2/11/16. Midterm next Thursday in-class (2/18) SML. recursion. math. David Kauchak CS 52 Spring 2016

Admin CS41B MACHINE. Midterm topics. Admin 2/11/16. Midterm next Thursday in-class (2/18) SML. recursion. math. David Kauchak CS 52 Spring 2016 Admin! Assignment 3! due Monday at :59pm! Academic honesty CS4B MACHINE David Kauchak CS 5 Spring 6 Admin Midterm next Thursday in-class (/8)! Comprehensive! Closed books, notes, computers, etc.! Except,

More information