Databases: How They Work
- Rosamond Peters
Transcription
In July 1996, Larry Page (one of the founders of Google) wrote the following:

"I am almost out of disk space. I have downloaded about 24 million unique URLs and about 100 million links. I think I will need 8 gigs more to store everything. Current retail prices are about $1000/4 gigs. I have only about 15% of the pages but it seems promising."
Why Databases?

Now (2006) there are tens of billions (1 billion = 10^9) of documents on the Internet. How can anyone find a needle in this haystack?

Answer: Pointers to documents are stored in databases in the form of keywords, so that specific documents can be found and retrieved by using a combination of search terms.

Data Storage Background

Suppose there are 20 billion (= 20x10^9) documents and the first 100 characters of each are stored using 1 byte (= 8 bits) per character. Then 20x10^9 x 100 = 2x10^12 bytes, or 2 trillion (= 2000 billion) bytes, of storage are needed. A 200 GB (gigabyte) hard drive costs about $200, so 10 drives are needed at a total of $2000, which is quite reasonable.
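The storage estimate above can be checked with a few lines of arithmetic. This is a minimal sketch that only restates the numbers given in the text; the function and constant names are illustrative, and a gigabyte is taken as 10^9 bytes, as the slide does.

```python
# Back-of-the-envelope storage estimate using the figures from the text.
NUM_DOCUMENTS = 20 * 10**9            # 20 billion documents
CHARS_PER_DOCUMENT = 100              # only the first 100 characters are kept
BYTES_PER_CHAR = 1                    # 1 byte (8 bits) per character

DRIVE_CAPACITY_BYTES = 200 * 10**9    # one 200 GB drive (decimal gigabytes)
DRIVE_PRICE_DOLLARS = 200             # price per drive quoted in the text


def storage_estimate():
    """Return (total bytes, drives needed, total cost in dollars)."""
    total_bytes = NUM_DOCUMENTS * CHARS_PER_DOCUMENT * BYTES_PER_CHAR
    # Ceiling division: a partly filled drive still has to be bought.
    drives = -(-total_bytes // DRIVE_CAPACITY_BYTES)
    cost = drives * DRIVE_PRICE_DOLLARS
    return total_bytes, drives, cost


total_bytes, drives, cost = storage_estimate()
print(f"{total_bytes:.0e} bytes -> {drives} drives, ${cost}")
```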
Reading Hard Drive

Data is written in the form of concentric tracks onto the platters in the hard drive. Track capacity is measured in kbytes (kilobytes), and the platters rotate at approximately 7200 rpm (= 7200/60 = 120 rev/sec). Assuming 167 kbytes/track, the read rate is 120 x 167x10^3, which is approximately 20x10^6 bytes/sec = 20 MB/s (megabytes/sec). Reading a full 200 GB drive then takes 200 GB / 20 MB/s = 2x10^11 / 2x10^7 = 10^4 sec. Note that 10^4 sec = 2.78 hours for 200 GB. Direct search/read from hard drives is not feasible!

Database Organization

Desirable properties of a database:
- Answer queries fast, with options to narrow or broaden search results.
- Easy to update and maintain without introducing inconsistencies.
- Flexible and easy to expand as data and users evolve.
- Scalable as size increases (by orders of magnitude, e.g., from millions to billions).
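The hard-drive read-time estimate on this page can be reproduced as follows. This is a rough sketch under the same simplification the text makes (one full track is read per platter revolution, with no seek time); the function name is illustrative. Without rounding the read rate to 20 MB/s, the result comes out slightly under the 10^4 sec quoted above.

```python
# Time to read an entire drive sequentially, using the figures from the text.
RPM = 7200                            # platter rotation speed
BYTES_PER_TRACK = 167 * 10**3         # assumed 167 kbytes per track
DRIVE_CAPACITY_BYTES = 200 * 10**9    # 200 GB drive


def full_scan(rpm=RPM, bytes_per_track=BYTES_PER_TRACK,
              capacity=DRIVE_CAPACITY_BYTES):
    """Return (read rate in bytes/sec, scan time in sec, scan time in hours)."""
    revs_per_sec = rpm / 60                     # 7200 rpm -> 120 rev/sec
    read_rate = revs_per_sec * bytes_per_track  # one track per revolution
    seconds = capacity / read_rate
    return read_rate, seconds, seconds / 3600


read_rate, seconds, hours = full_scan()
print(f"{read_rate / 1e6:.2f} MB/s, {seconds:.0f} sec, {hours:.2f} hours")
```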
Card File of Mini Zoo

[Figure: seven index cards, one per animal, each with the fields Name, Habitat, Colors, Speed, and Caretaker. Most field values were pictures and are not reproduced here. Some cards list two habitats or up to three colors; the colors Red and Yellow survive as text. The caretaker schedules shown are (6-8 am, 3-6 pm), (10 am - 4 pm), and (6-7 am, 3-4 pm).]

Suppose one of the caretakers changes his schedule. Then we need to go through all cards and enter the change.

Database Queries

We could ask some of the following questions:
- How many animals are fast?
- Which animals live in water and are grey?
- How many animals have a particular person as caretaker?
- Which animals are either green or fast?
Card File --> Flat File

Suppose we convert the card file to an electronic database on a computer. A straightforward solution is to enter each card into a table that has as many columns as there are features to record. For example, since some animals have 3 colors, we need 3 columns, labeled color1, color2, and color3.

Flat File Database

Columns: # | Name | Habitat1 | Habitat2 | Color1 | Color2 | Color3 | Speed | Caretaker

[Table: seven rows, one per animal. Most cell values were pictures and are not reproduced here; the surviving entries are the colors Yellow and Red and the caretaker schedules (6-8 am, 3-6 pm), (10 am - 4 pm), and (6-7 am, 3-4 pm).]

- Fixed number of columns.
- Wastes space if most animals have one or two colors.
- Suppose an exotic bird has four colors, or some animals have two caretakers that take turns. This means that the whole database needs to be enlarged.
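The flat-file layout above can be sketched in a few lines. The column names follow the table header in the text, but the three rows, the caretaker labels, and the helper functions are made up for illustration, since the slide's actual cell values did not survive. The sketch shows the two weaknesses named above: unused cells in fixed columns, and the need to look at every record to answer a question.

```python
# A flat file: every record has the same fixed set of columns.
COLUMNS = ("name", "habitat1", "habitat2",
           "color1", "color2", "color3", "speed", "caretaker")

# Hypothetical sample rows (illustrative only, not taken from the slides).
FLAT_FILE = [
    ("fish",  "water", None,   "grey",  None,    None,    "slow", "caretaker_a"),
    ("duck",  "water", "land", "green", "black", "white", "slow", "caretaker_b"),
    ("horse", "land",  None,   "black", None,    None,    "fast", "caretaker_a"),
]


def animals_with_color(rows, color):
    """Linear scan: every record is inspected, whether it matches or not."""
    found = []
    for row in rows:
        record = dict(zip(COLUMNS, row))
        if color in (record["color1"], record["color2"], record["color3"]):
            found.append(record["name"])
    return found


def unused_cells(rows):
    """Count the cells that exist only because the column count is fixed."""
    return sum(value is None for row in rows for value in row)


print(animals_with_color(FLAT_FILE, "black"))
print(unused_cells(FLAT_FILE))
```

An animal with a fourth color would not fit this layout without adding a color4 column to every row, which is the enlargement problem described above.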
Flat File Database

Now we can search electronically, but we still need to look at every record to find, for example, which animals are black. Also, if suddenly we have two caretakers that take turns in caring for some of the animals, we need to add a column to each record in the database.

Hierarchical Database

It would be nice to be able to find a record quickly by successive refining of the search query. For example, we could ask for all animals that live on land. And then we could ask for all animals that live on land and are fast.
Hierarchical Database

[Figure: hierarchical (tree) diagram of the zoo data. Surviving labels: DBMS, the colors Yellow and Red, and the caretaker schedules 6-7 am / 3-4 pm and 6-8 am / 3-6 pm.]

But there is still a lot of redundancy. For example, duck is listed twice. What if we just ask "How many animals are fast?"
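One way to picture the hierarchical organization is as a nested dictionary. The sketch below assumes a tree that branches first on habitat and then on speed, which matches the land / land-and-fast refinement described in the text; the actual branching order on the slide is not recoverable, and the animals other than duck and fish are hypothetical. It illustrates both points made above: duck appears twice, and a question that ignores the top level has to visit every branch.

```python
# Hypothetical hierarchy: habitat -> speed -> animal names (illustrative data).
HIERARCHY = {
    "land":  {"fast": ["horse"],   "slow": ["duck"]},
    "water": {"fast": ["dolphin"], "slow": ["duck", "fish"]},
}


def refine(habitat, speed):
    """Successive refinement: follow one branch per level of the tree."""
    return HIERARCHY[habitat][speed]


def count_by_speed(speed):
    """Speed is not the top level, so every habitat branch must be visited."""
    names = set()                      # a set, so duplicates count once
    for speeds in HIERARCHY.values():
        names.update(speeds.get(speed, []))
    return len(names)


def total_entries():
    """Number of stored entries, including redundant ones."""
    return sum(len(names)
               for speeds in HIERARCHY.values()
               for names in speeds.values())


print(refine("land", "fast"))
print(count_by_speed("fast"))
print(total_entries())
```

Here five entries are stored for four distinct animals, because duck lives in two habitats.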
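Network Database

[Figure: network diagram of the zoo data. Surviving labels: the caretaker schedules (6-7, 3-4), (10-4), and (6-8, 3-6) and the colors Red and Yellow.]

- Has very little redundancy.
- We can easily ask "How many animals does a given caretaker take care of?"
- Gets very complex quickly.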
Relational Database

Tables and their columns:
- Speed Table: Name, Speed
- Caretaker Table: Name, Caretaker
- Habitat Table: Name, Habitat
- Color Table: Name, Color
- Caretaker Schedule Table: Caretaker, Schedule

[Table contents: most cell values were pictures and are not reproduced here; the surviving entries are the colors Red and Yellow and the schedules 6-7 am, 3-4 pm, 10 am - 4 pm, 6-8 am, and 3-6 pm.]

- All information in the database is represented as values in tables.
- Each property (e.g., color or caretaker) appears once and only once as a column in a table.
- Each datum in the database is accessible by a combination of table name, column name, and primary key (e.g., fish, duck) value.
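The five tables above map directly onto a small SQL schema. The following is a sketch using Python's built-in sqlite3 module with an in-memory database; the table and column names follow the text, while the rows, the caretaker labels, and the helper functions are hypothetical. It shows one of the earlier example queries and why a schedule change now touches a single row instead of every card.

```python
import sqlite3


def build_zoo():
    """Create the five tables from the text and load hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE speed (name TEXT PRIMARY KEY, speed TEXT);
        CREATE TABLE caretaker (name TEXT PRIMARY KEY, caretaker TEXT);
        CREATE TABLE habitat (name TEXT, habitat TEXT);
        CREATE TABLE color (name TEXT, color TEXT);
        CREATE TABLE caretaker_schedule (caretaker TEXT PRIMARY KEY, schedule TEXT);
        """
    )
    conn.executemany("INSERT INTO speed VALUES (?, ?)",
                     [("fish", "slow"), ("duck", "slow"), ("horse", "fast")])
    conn.executemany("INSERT INTO caretaker VALUES (?, ?)",
                     [("fish", "caretaker_a"), ("duck", "caretaker_b"),
                      ("horse", "caretaker_a")])
    # An animal with two habitats or several colors simply gets more rows.
    conn.executemany("INSERT INTO habitat VALUES (?, ?)",
                     [("fish", "water"), ("duck", "water"),
                      ("duck", "land"), ("horse", "land")])
    conn.executemany("INSERT INTO color VALUES (?, ?)",
                     [("fish", "grey"), ("duck", "green"),
                      ("duck", "black"), ("horse", "black")])
    conn.executemany("INSERT INTO caretaker_schedule VALUES (?, ?)",
                     [("caretaker_a", "6-8 am, 3-6 pm"),
                      ("caretaker_b", "10 am - 4 pm")])
    return conn


def water_and_grey(conn):
    """Which animals live in water and are grey?"""
    rows = conn.execute(
        "SELECT h.name FROM habitat AS h JOIN color AS c ON c.name = h.name "
        "WHERE h.habitat = 'water' AND c.color = 'grey' ORDER BY h.name"
    ).fetchall()
    return [name for (name,) in rows]


def change_schedule(conn, caretaker, schedule):
    """Update one caretaker's schedule; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE caretaker_schedule SET schedule = ? WHERE caretaker = ?",
        (schedule, caretaker),
    )
    return cur.rowcount


conn = build_zoo()
print(water_and_grey(conn))
print(change_schedule(conn, "caretaker_a", "7-9 am, 3-6 pm"))
```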
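After Sorting wrt Column 2

[Figure: the same five tables (Speed, Caretaker, Habitat, Color, Caretaker Schedule), sorted with respect to their second column. Surviving entries: the schedules 6-7 am, 6-8 am, 10 am - 4 pm, 3-4 pm, 3-6 pm and the colors Red and Yellow.]

[Worked examples: three sets of animals, followed by two AND combinations (one of them empty) and one OR combination. The set names and members were pictures and are not reproduced here.]

Sets, Boolean Operators

[Figure: diagrams for Set A, Set B, A AND B, and A OR B.]

Note: A OR B = A + B - (A AND B). Here A + B counts every element shared by both sets twice, so the overlap A AND B is subtracted once.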
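Python's built-in set type gives a compact illustration of AND, OR, and the counting note above. The set members below are hypothetical, since the slide's own examples did not survive; only the operators and the identity come from the text.

```python
# Hypothetical sets of animal names (illustrative only).
FAST = {"horse", "dolphin"}
GREEN = {"duck", "frog"}
WATER = {"fish", "duck", "dolphin", "frog"}

fast_and_water = FAST & WATER     # AND: members of both sets
fast_and_green = FAST & GREEN     # AND can be empty
green_or_fast = GREEN | FAST      # OR: members of either set


def or_count(a, b):
    """Size of A OR B computed as A + B - (A AND B)."""
    return len(a) + len(b) - len(a & b)


print(sorted(fast_and_water))
print(sorted(green_or_fast))
print(or_count(GREEN, WATER), len(GREEN | WATER))
```

The last line prints the same number twice: counting via the identity agrees with building the union directly.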
Sets, Boolean Operators

[Figure: diagrams for Set A, NOT A, A AND NOT A, and A OR NOT A.]

- A AND NOT A: empty set (nothing)
- A OR NOT A: everything (whole database)

Truth Tables

0: false, 1: true
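The truth tables themselves are not preserved in the transcription, but the standard tables for AND, OR, and NOT can be generated with the 0/1 convention stated above. The sketch below also checks the two set facts from this page; the set contents are hypothetical.

```python
from itertools import product


def truth_table(op):
    """Rows (a, b, result) for a two-input operator, with 0 = false, 1 = true."""
    return [(a, b, int(op(bool(a), bool(b))))
            for a, b in product((0, 1), repeat=2)]


AND_TABLE = truth_table(lambda a, b: a and b)
OR_TABLE = truth_table(lambda a, b: a or b)
NOT_TABLE = [(a, int(not a)) for a in (0, 1)]

for name, table in (("AND", AND_TABLE), ("OR", OR_TABLE), ("NOT", NOT_TABLE)):
    print(name, table)

# NOT on sets is taken relative to the whole database (hypothetical contents).
EVERYTHING = {"fish", "duck", "horse", "frog"}
A = {"fish", "duck"}
NOT_A = EVERYTHING - A

print(A & NOT_A == set())          # A AND NOT A is the empty set
print(A | NOT_A == EVERYTHING)     # A OR NOT A is the whole database
```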