Project 1 for CMPS 181: Implementing a Paged File Manager

Similar documents
COMP 3500 Introduction to Operating Systems Project 5 Virtual Memory Manager

EE 422C HW 6 Multithreaded Programming

Computer Architecture Assignment 4 - Cache Simulator

Assignment 2, due September 17

Therefore, it is mandatory that you compile, link and run your assignment on a lab unix machine before submitting it.

CS 3114 Data Structures and Algorithms DRAFT Project 2: BST Generic

Buffer Manager: Project 1 Assignment

Note: This is a miniassignment and the grading is automated. If you do not submit it correctly, you will receive at most half credit.

It is academic misconduct to share your work with others in any form including posting it on publicly accessible web sites, such as GitHub.

15-110: Principles of Computing, Spring 2018

Programming Assignment 2

CS 342 Software Design Spring 2018 Term Project Part III Saving and Restoring Exams and Exam Components

Due: 9 February 2017 at 1159pm (2359, Pacific Standard Time)

Black Problem 2: Huffman Compression [75 points] Next, the Millisoft back story! Starter files

Tips from the experts: How to waste a lot of time on this assignment

Project 8: Virtual Machine Translator II

CMPSCI 187 / Spring 2015 Sorting Kata

Digital Lab Reports. Guide for 1250 Students

Programming Standards: You must conform to good programming/documentation standards. Some specifics:


[ 8 marks ] Demonstration of Programming Concepts

CSE 373: Homework 1. Queues and Testing Due: April 5th, 11:59 PM to Canvas

Tips from the experts: How to waste a lot of time on this assignment

ECE2049 Embedded Computing in Engineering Design. Lab #0 Introduction to the MSP430F5529 Launchpad-based Lab Board and Code Composer Studio

Lab 1: Silver Dollar Game 1 CSCI 2101B Fall 2018

CS 103 The Social Network

Project 1 Balanced binary

Project 3: B+ Tree Index Manager

************ THIS PROGRAM IS NOT ELIGIBLE FOR LATE SUBMISSION. ALL SUBMISSIONS MUST BE RECEIVED BY THE DUE DATE/TIME INDICATED ABOVE HERE

ENGR 3950U / CSCI 3020U (Operating Systems) Simulated UNIX File System Project Instructor: Dr. Kamran Sartipi

Com S 227 Assignment Submission HOWTO

Ascii Art. CS 1301 Individual Homework 7 Ascii Art Due: Monday April 4 th, before 11:55pm Out of 100 points

CpSc 1111 Lab 9 2-D Arrays

COMP Assignment 1

Project 1: Buffer Manager

ZedBoard Tutorial. EEL 4720/5721 Reconfigurable Computing

EE516: Embedded Software Project 1. Setting Up Environment for Projects

Project 5: Extensions to the MiniJava Compiler

CS143 Handout 05 Summer 2011 June 22, 2011 Programming Project 1: Lexical Analysis

King Abdulaziz University Faculty of Computing and Information Technology Computer Science Department

Programming Assignment 1

// Initially NULL, points to the dynamically allocated array of bytes. uint8_t *data;

Data Management, Spring 2015: Project #1

CMPSCI 187 / Spring 2015 Implementing Sets Using Linked Lists

Project #1: Tracing, System Calls, and Processes

Project 2: Buffer Manager

CS 1550 Project 3: File Systems Directories Due: Sunday, July 22, 2012, 11:59pm Completed Due: Sunday, July 29, 2012, 11:59pm

ENCE 3241 Data Lab. 60 points Due February 19, 2010, by 11:59 PM

COP Programming Assignment #7

AE Computer Programming for Aerospace Engineers

Project 6: Assembler

Homework Assignment #3

Programming Assignments

CIS192: Python Programming

Our second exam is Thursday, November 10. Note that it will not be possible to get all the homework submissions graded before the exam.

Regis University CC&IS CS310 Data Structures Programming Assignment 2: Arrays and ArrayLists

Note: This is a miniassignment and the grading is automated. If you do not submit it correctly, you will receive at most half credit.

CS3114 (Fall 2013) PROGRAMMING ASSIGNMENT #2 Due Tuesday, October 11:00 PM for 100 points Due Monday, October 11:00 PM for 10 point bonus

3. A Periodic Alarm: intdate.c & sigsend.c

ECE 122 Engineering Problem Solving with Java

Project 5 Handling Bit Arrays and Pointers in C

CS 385 Operating Systems Fall 2011 Homework Assignment 5 Process Synchronization and Communications

Software Design and Analysis for Engineers

Project #1 Exceptions and Simple System Calls

The input of your program is specified by the following context-free grammar:

CS 344/444 Spring 2008 Project 2 A simple P2P file sharing system April 3, 2008 V0.2

CS 1520 / CoE 1520: Programming Languages for Web Applications (Spring 2013) Department of Computer Science, University of Pittsburgh

CSSE2002/7023 The University of Queensland

CMPSCI 187 / Spring 2015 Postfix Expression Evaluator

CS350 : Operating Systems. General Assignment Information

OOP- 4 Templates & Memory Management Print Only Pages 1-5 Individual Assignment Answers To Questions 10 Points - Program 15 Points

Com S 227 Spring 2018 Assignment points Due Date: Thursday, September 27, 11:59 pm (midnight) "Late" deadline: Friday, September 28, 11:59 pm

CSE 12 Week Eight, Lecture Two

Note: This is a miniassignment and the grading is automated. If you do not submit it correctly, you will receive at most half credit.

Organizing a storage hierarchy (creating folders etc).such as: keeping all related files in a sub folder. Reviewers: REF No:

(Provisional) Lecture 08: List recursion and recursive diagrams 10:00 AM, Sep 22, 2017

Problem Set 6 Audio Programming Out of 40 points. All parts due at 8pm on Thursday, November 29, 2012

Deep Dive: Pronto Transformations Reference

6.034 Artificial Intelligence, Fall 2006 Prof. Patrick H. Winston. Problem Set 0

You will work in groups of four for the project (same groups as Project 1).

You just told Matlab to create two strings of letters 'I have no idea what I m doing' and to name those strings str1 and str2.

Pointer Casts and Data Accesses

CSE100 Principles of Programming with C++

Data Structures and Algorithms

CSE 333 Midterm Exam July 24, Name UW ID#

Network Administration/System Administration (NTU CSIE, Spring 2018) Homework #1. Homework #1

Download, Install and Use Winzip

Network Simulator Project Guidelines Introduction

CMPSCI 187 / Spring 2015 Hanoi

Project 4: Implementing Malloc Introduction & Problem statement

gcc o driver std=c99 -Wall driver.c bigmesa.c

15-110: Principles of Computing, Spring 2018

ECE QNX Real-time Lab

Assignment 1: Plz Tell Me My Password

COSC 6385 Computer Architecture. - Homework

COMP 1006/1406 Assignment 1 Carleton Foodorama (Part 1) Due: Friday, July 14 th 2006, before 11h55 pm

Creating a String Data Type in C

Tips from the experts: How to waste a lot of time on this assignment

Introduction to Programming II Winter, 2015 Assignment 5 (Saturday, April 4, 2015: 23:59:59)

Lab 8: Ordered Search Results

Transcription:

Project 1 for CMPS 181: Implementing a Paged File Manager Deadline: Sunday, April 23, 2017, 11:59 pm, on Canvas. Introduction In project 1, you will implement a very simple paged file (PF) manager. It builds up the basic file system required for continuing with projects 2,3 and 4. The PF component provides classes and methods for managing files and pages in files. In addition, you will implement the first few operations for a record-based file (RBF) manager (which you will continue working on in part 2 of the project) on top of your basic paged file system. This document aims at providing you with the necessary information required to start project 1. For all 4 CMPS 181 projects, you will be working in teams of 2 or 3 people, where at least one person on each team is familiar with C++. We ll form teams during the first week of classes, and we hope that teams will last throughout the entire term. Only one person on a team needs to submit each assignment. Other team members will be listed in the project report file. Goal The goal of project 1 is threefold: Get (re)familiar with a C/C++ development environment. Implement a simple paged file system. Implement a few operations of a record-oriented (a.k.a. tuple-oriented) file system. The detailed description of the project itself is in the CMPS181_Project1_Description file. Overview of Steps 1. Download and deploy the codebase of Project 1. 2. Finish the development of Project 1. Detailed Instructions 1. Download and deploy the codebase of Project 1 Download the codebase of Project 1 Please download the codebase.zip file onto your own computer. Unzip the file. Note that almost all of the files are tests that you won t have to modify.

Read the readme.txt under./codebase/. Go to the codebase, and modify the CODEROOT in makefile.inc properly. Go to folder "rbf", and type in: make clean make./rbftest1 You will be able to see the output. 2. Finish the development of Project 1 We have seen the results of running the code in codebase. But, since the implementation of the methods is empty, you cannot manage any files yet. Please finish the implementation in pfm.cc as well as the following methods in rbfm.cc (besides the constructor and destructor): 1) insertrecord. 2) readrecord. 3) printrecord. The remaining methods are not required for part 1 of the project; instead you will implement them as part of part 2 of the project. Please write your own test cases to test your code. You are responsible for anticipating other things that might go wrong that we haven't provided public tests for, just as you would be if you were building a DBMS in an industrial setting. You may find these functions useful: http://www.cplusplus.com/reference/clibrary/cstdio/ Submission Instructions The following are requirements on your submission. Points will likely be deducted if they are not followed. Suppress any debug messages that you put in your code. Only the original messages in the test cases should be printed. Write a short report to briefly describe the implementation (key design choices) of your paged file and record-based file system layers. Please supply your report as a text file named project1_report.txt, not a PDF, Word Document, or other non-text format. Do not use RTF format. Format of the report is described in the project1_report.txt file. All team members should be listed, not just the submitter.

Submit the source code under the "rbf" folder. Make sure you do a "make clean" first, and do NOT include any useless files (such as binary files and data files). You must make sure your makefile runs properly! Please organize your project according to the following directory hierarchy: project1-studentid / codebase / {rbf, makefile.inc, readme.txt, project1_report.txt} where the rbf folder includes your source code and the makefile. (e.g., project1-12345678 / codebase / {rbf, makefile.inc, readme.txt, project1_report.txt}) Compress project1-studentid into a SINGLE zip file named "project1-studentid.zip" (e.g., project1-12345678.zip). Put the test.sh file we provided and the zip file under the same directory. Run the test.sh script to verify that your project can be properly unzipped and tested (using your own makefile.inc and rbftest.cc when you are testing the script). If the script doesn't work correctly, it's likely that your folder organization doesn't meet the above requirements. Our grading will be done by automatically done by running the script. The usage of the script is:./test.sh ''project1-studentid'' IMPORTANT: You may develop your code on your own computer, or on any other system, but you need to make sure that your script works on unix.ucsc.edu, as testing and grading of your submission will be done there. Remember to suppress any debug messages that you put in your code. Only the original messages in the test cases should be printed. Submit the zip file "project1-studentid.zip" to Canvas (e.g., project1-12345678.zip). Only one student on each team needs to do this, but your report should identify all team members, and your team number. That requires that you FTP your zip file from unix to your computer so that you can post it on Canvas. Please ask the course T.A., Ashutosh Raina, for help with that if needed. Testing Please test your code using the test cases inside the codebase. Note that these test cases will be used to grade your project partially, but we also have our own private test cases. This is by no means an exhaustive test suite. You should add more cases to this and test your code thoroughly!

Grading Rubrics Full Credit: 100 points 1. Follow the submission requirements (5%) Code is structured as required (e.g., the directory structure). Makefile works. Code compiles and links correctly. Please make sure that your submission follows these requirements and can be unzipped and built by running the script we provided. Failure to follow the instructions could make your project unable to build on our machines, which may result in loss of points. 2. Project Report (10%) Fill in each section of the report (project1_report.txt) in the codebase directory. Clearly document the design of your project (internal record format and page format). 3. Implementation details (15%) Implement all the required functionalities Implementation should follow the basic requirements of each function, such as file should be created on disk. 4. Pass the provided test suite. (50%) Each of which is graded as pass/fail. There are no partial points for each test case. 5. Pass the private test suite. (15%) There are private tests, each of which is graded as pass/fail. Again, there are no partial points for each test case. 6. Valgrind (5%) You should not have any memory leaks. A valgrind check should not fail.

Q & A Q1: Can we make changes to the makefile provided? A: You have to make sure that we can build your submitted project in command line by using 'make'. If you don't add any new source files to the project, you shouldn't need many modifications to our makefile. Here is a tutorial for the 'make' tool. You can refer to it if you do decide to change the makefile. Q2: Consider a case where Page 3 of the file is full but Page 2 is partially filled and the user wants to append data? Now, if the size of the data that he or she wants to write is more than the available space on Page 2, what is the expected action to be taken? Do we just fit in whatever data we can and truncate the rest, OR completely disallow the user from making such a write? A: AppendPage() always happens to the end of the file, so this scenario can't arise. The number of file bytes affected by each page operation is always PAGE_SIZE. The paged file system layer always deals in pages -- nothing more and nothing less. Q3: Is it fine if I do the file handling in C++ using the binary mode of read/write? A: You should indeed use the binary mode! Q4: Why is the access specifier of the constructor and destructor of the class PagedFileManager set to be "protected"? A: The PagedFileManager is a singleton class, which means only ONE instance of PagedFileManager is allowed. You cannot instantiate the class by calling its constructor. Instead you should get an instance of the class by calling the Instance() function of PagedFileManager. The Instance() function has been implemented for you in pfm.cc. The same applies to the RecordBasedFileManager. Q5: As for pages, if I understand correctly, the Read/Write/AppendPage functions are operating on these files, and if you want to write the 3rd page (page number: 2) of a file, you'd seek 8K bytes into the file and start writing the data. Is this correct, or am I misunderstanding the concept of pages? A: o o o Read reads a page that must already exist. Append adds a new page at the end of the file, increasing the number of pages in the file by one. Append is needed when you re inserting a record into a file, but there s no room for that record in any of the existing pages of the file. Write overwrites a page that must already exist. To write to the 3rd page of a file, then if the 3rd page already exists, you can overwrite it. However, if the 3rd page doesn t exist and the file has 2 pages (page numbers: 0,1) that contain valid data, then you can Append a new page (page number 2), and then Write to that new 3rd page. Please do not leave "holes" in a file by writing past EOF. We won't allow you to append garbage pages that aren t in the file.

Q6: Since I need to change the path of codebase in makefile.inc to test the project, do I need to change it back when I submit the zip file? A: No, you don't need to change back, but you need to instead make sure the path is relative so that the test.sh script will also work on another machine. Q7: When inserting a tuple, do we only have to consider insertion of the new tuple at the end of the last page? Or do we have to be able to support insertion in whatever free space may exist among all the current pages? A: You should first try to insert the record on the last (currently existing) page. If that fails, you should then try to find the first page with sufficient space available (e.g., looking from the beginning of the file). If none exists, then (and only then) should you append a new page to hold the new tuple. Q8: What's the data format for data being passed to insertrecord? A: The API format for insertrecord is as follows: Suppose you have five fields and their types are varchar(20), integer, varchar(20), real, and string. If a record is ("Tom", 25, "UCSantaCruz", 3.1415, 100), then the format of the record should be: [1 byte for the null-indicators for the fields: bit 00000000] [4 bytes for the length 3] [3 bytes for the string "Tom"] [4 bytes for the integer value 25] [4 bytes for the length 11] [11 bytes for the string "UCSantaCruz"] [4 bytes for the float value 3.1415] [4 bytes for the integer value 100]. Note that integer and real type fields do not have an associated length value in front of them; this is because each of these types always occupies 4 bytes. The first part of the input contains n bytes for passing the null information about each of the incoming record's fields. The value n can be calculated by using this formula: ceil(number of fields in a record / 8). For example, in this case, since there are 5 fields, the size of "n" can be calculated by ceil(5/8) = 1. If there are 20 fields, the size will be ceil(20/8) = 3. The left-most bit in the first byte corresponds to the first field. The rightmost bit in the first byte corresponds to the eighth field. If there are more than eight fields, the left-most bit in the second byte corresponds to the ninth field and so on. If a field value is NULL, the corresponding bit in the null bit vector will be set to 1. For example, if we have a record ("Tom", 25, NULL, NULL, 100) whose third attribute and fourth attribute are NULL, the first part contains 00110000 as the bit pattern in one byte. The actual byte representation will be: [1 byte for the null-indicators for the fields: 00110000] [4 bytes for the length 3] [3 bytes for the string "Tom"] [4 bytes for the integer value 25] [4 bytes for the integer value 100]. Note that there are no values to represent NULL values in the actual data. You MUST follow this API format! NOTE: This API data format is just intended for passing data into the insertrecord(). This does not mean that the internal representation of your record should be the same as this format -- in fact, it almost certainly will not be! (On-page record formatting options will be covered in lecture, and your project should make good choices for what it does based on what you learn in class.)

Q9: Can we assume that a record can fit on a page (i.e., the size of a record < the predefined page size)? A: Yes. You can assume that a record can fit on a page.