Cosc 242 Assignment. Due: 4pm Friday September 15th 2017

Group work

For this assignment we require you to work in groups of three people. You may select your own group and inform us of your choice via email by August 16th. Send an email to ihewson@cs.otago.ac.nz letting us know the names and University user codes of all students in your group. Any students who don't select their own group will be assigned to one and informed via email on August 18th. Groups will be assigned in a pseudo-random manner (students will be teamed up with others who have completed a similar amount of internal assessment).

The related labs, as well as this assignment, should be done working in your groups. You must not collaborate with, or discuss issues related to the assignment with, anyone who is not a member of your group.

Overview

In this assignment you will combine some of the code written in the labs to produce a larger program. This program will be used to process two groups of words. The first group will be read from a file specified on the command line. The second group will be read from stdin. If any word read from stdin is not contained in the file then it will get printed to stdout. A moment of reflection will confirm that this program could be used as a rudimentary spellchecker. Running your program with a command like this:

    ./asgn dictionary.txt < document.txt

should print out a list of every word from document.txt which is not found in dictionary.txt. If there is no output then document.txt has no misspelled words (as defined in dictionary.txt).

Basic outline

The data structure that you will use to hold the words is a hash table. Each position in the hash table will be a container which can hold multiple items (i.e. chaining). You will create two types of container, one using a very simple chaining method, and one using a more robust chaining method. The idea behind this is to make sure that your hash table won't be too slow to search, even if the hash function performs poorly.

If your hash table size is reasonable and you have a decent hash function then the simple chaining method will work fine. However, if your hash function hashes lots of items to the same position then the hash table will become too slow for practical use. You should use the hash function given in lab 12 for this assignment. We can simulate a bad hash function simply by reducing the size of the hash table. This will ensure that lots of items hash to the same position in the table.

Your program will create a hash table and fill it with words read from a file specified on the command line. Your hash table will be made up of containers which can hold any number of items. A container is really just a wrapper for a flexarray or a red-black tree. So, for example, adding something to a container will either append it to the underlying flexarray, or insert it into the underlying red-black tree.

Once all of the words in the file have been put into the hash table, words get read from stdin. If a word read from stdin is not found in the hash table then it gets printed to stdout, otherwise nothing happens. When there are no more words to read from stdin your program finishes.

Specific requirements

In order to complete this assignment successfully there are a number of requirements you will need to meet, including the following:

- The default size of the hash table should be 3877.
- All memory allocated should be deallocated before your program finishes.
- Change a copy of your flexarray from lab 10 so that it works with strings instead of integers. Add an is_present function, and a visit function (this should take a function as a parameter which is used to print each item in your flexarray in a similar manner to the RBT preorder function).
- Words should be read using the getword() function from the lab book.
- Helper functions like getword and emalloc should be in your mylib.c file.
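As a rough illustration of the flexarray requirement, here is a minimal sketch of a string flexarray with is_present and visit functions. The struct layout, the initial capacity of 2, and the names checked_realloc and print_word are assumptions made for this example; they are not taken from the lab book, and your lab 10 code will probably differ.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the lab 10 flexarray may use other names. */
struct flexarrayrec {
    int capacity;
    int itemcount;
    char **items;
};
typedef struct flexarrayrec *flexarray;

/* Stand-in for the emalloc/erealloc helpers expected in mylib.c. */
static void *checked_realloc(void *p, size_t s) {
    void *result = realloc(p, s);
    if (result == NULL) {
        fprintf(stderr, "Memory allocation failed!\n");
        exit(EXIT_FAILURE);
    }
    return result;
}

/* Creates an empty flexarray with room for two words. */
flexarray flexarray_new(void) {
    flexarray f = checked_realloc(NULL, sizeof *f);
    f->capacity = 2;
    f->itemcount = 0;
    f->items = checked_realloc(NULL, f->capacity * sizeof f->items[0]);
    return f;
}

/* Appends a private copy of word, doubling the capacity when full. */
void flexarray_append(flexarray f, const char *word) {
    if (f->itemcount == f->capacity) {
        f->capacity *= 2;
        f->items = checked_realloc(f->items,
                                   f->capacity * sizeof f->items[0]);
    }
    f->items[f->itemcount] = checked_realloc(NULL, strlen(word) + 1);
    strcpy(f->items[f->itemcount], word);
    f->itemcount++;
}

/* Returns 1 if word is stored in f, 0 otherwise (linear search). */
int flexarray_is_present(flexarray f, const char *word) {
    int i;
    for (i = 0; i < f->itemcount; i++) {
        if (strcmp(f->items[i], word) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Calls visit once for each stored word, in insertion order. */
void flexarray_visit(flexarray f, void visit(const char *word)) {
    int i;
    for (i = 0; i < f->itemcount; i++) {
        visit(f->items[i]);
    }
}

/* Example visitor: prints one word followed by a space. */
void print_word(const char *word) {
    printf("%s ", word);
}

/* Frees every stored word, then the array, then the record itself. */
void flexarray_free(flexarray f) {
    int i;
    for (i = 0; i < f->itemcount; i++) {
        free(f->items[i]);
    }
    free(f->items);
    free(f);
}
```

Calling flexarray_visit(f, print_word) prints the stored words in insertion order, each followed by a space. Passing the visitor as a parameter keeps the printing format out of the flexarray itself, which mirrors how the RBT preorder function is described.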
- Modify your RBT to make it possible to add duplicate words.
- You may have noticed that the RBT you implemented in labs doesn't ensure that the root is always black. This doesn't affect the structure of the tree at all, but you should try to fix this problem in your program.
- When printing out containers from your hash table you should print the contents of each non-empty container on a single line with each item separated by a space. You should also print the index of the container in the hash table at the start of the line. If the container type is an RBT then the items should be printed in preorder.
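One possible way to approach the RBT points above is sketched below. It is deliberately incomplete: the rotation and recolouring fix-up from the labs is left out, so the tree here is not balanced. The sketch only shows duplicates being sent to the right subtree, the root being recoloured black by a wrapper after every insert, and a preorder traversal that takes a visitor function. The node layout and the names insert_node, new_node and print_key are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { RED, BLACK } rbt_colour;

/* Hypothetical node layout; the lab RBT may use other names. */
typedef struct rbt_node *rbt;
struct rbt_node {
    char *key;
    rbt_colour colour;
    rbt left;
    rbt right;
};

/* Allocates a red leaf holding a private copy of key. */
static rbt new_node(const char *key) {
    rbt r = malloc(sizeof *r);
    if (r == NULL || (r->key = malloc(strlen(key) + 1)) == NULL) {
        fprintf(stderr, "Memory allocation failed!\n");
        exit(EXIT_FAILURE);
    }
    strcpy(r->key, key);
    r->colour = RED;
    r->left = NULL;
    r->right = NULL;
    return r;
}

/* Recursive insert. Keys that compare equal go to the right, so
 * duplicates are kept. A full RBT would run its fix-up on the way
 * back up; that step is omitted in this sketch. */
static rbt insert_node(rbt r, const char *key) {
    if (r == NULL) {
        return new_node(key);
    }
    if (strcmp(key, r->key) < 0) {
        r->left = insert_node(r->left, key);
    } else {
        r->right = insert_node(r->right, key);
    }
    return r;
}

/* Public insert: only this wrapper knows which node is the root,
 * so it is the natural place to force the root to be black. */
rbt rbt_insert(rbt r, const char *key) {
    r = insert_node(r, key);
    r->colour = BLACK;
    return r;
}

/* Visits the node first, then its left and right subtrees. */
void rbt_preorder(rbt r, void visit(const char *key)) {
    if (r == NULL) {
        return;
    }
    visit(r->key);
    rbt_preorder(r->left, visit);
    rbt_preorder(r->right, visit);
}

/* Example visitor: prints one key followed by a space. */
void print_key(const char *key) {
    printf("%s ", key);
}

/* Frees both subtrees before the node itself. */
void rbt_free(rbt r) {
    if (r == NULL) {
        return;
    }
    rbt_free(r->left);
    rbt_free(r->right);
    free(r->key);
    free(r);
}
```

Recolouring in the wrapper rather than inside the recursive function avoids having to tell the root apart from interior nodes during the recursion.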

- The container type should be defined by an enumerated type declaration in container.h like this:

    typedef enum container_e {FLEX_ARRAY, RED_BLACK_TREE} container_t;

- Your program should take a number of options on the command line to alter its behaviour as specified in this table.

    Option          Action performed
    -r              Use a robust chaining method. This tells your hash table to use a red-black tree as its container type.
    -s table-size   Use table-size as the size of your hash table. You can assume that table-size will be a number greater than 0.
    -p              Print your hash table to stdout, one line per non-empty container. The index of the container in the hash table should be printed at the start of the line. Don't read anything from stdin or print anything else out when this option is used.
    -i              Print information (to stderr) about how long it took to fill the hash table, how long it took to search the hash table, and how many unknown words were found, like this:
                        Fill time : 1.390000
                        Search time : 0.450000
                        Unknown words : 8690
    -h              Print a help message to stderr describing how to use the program and then exit.

- Create a "container" data structure which is just a wrapper for a flexarray or a red-black tree. It should contain a struct which looks like this:

    struct containerrec {
        container_t type;
        void *contents;
    };

  Each of the functions in container.c will usually just call the appropriate function in either the flexarray or red-black tree, e.g.

    void container_add(container c, char *word) {
        if (c->type == RED_BLACK_TREE) {
            c->contents = rbt_insert(c->contents, word);
        } else {
            flexarray_append(c->contents, word);
        }
    }

- Use the getopt library to help you process the options given on the command line. Here is an example of how to use it:

    const char *optstring = "ab:c";
    int option; /* int, not char: getopt returns an int */

    /* getopt returns -1 once every option has been processed */
    while ((option = getopt(argc, argv, optstring)) != -1) {
        switch (option) {
            case 'a':
                /* do something */
                break;
            case 'b':
                /* the argument after the -b is available
                   in the global variable optarg */
                break;
            case 'c':
                /* do something else */
                break;
            default:
                /* if an unknown option is given */
                break;
        }
    }

  You need to include getopt.h to use the getopt library. The letters listed in optstring are possible valid options. The colon following the letter b indicates that b takes an argument. As the options are being processed by getopt, they get shifted to the front of the argv array. After processing, the index of the first non-option argument is available in the global variable optind. For more information have a look at the man page for getopt.h (type man 3 getopt).

- If -h (or an invalid option) is given as a command-line argument then a usage message should be printed to stderr and your program should exit.

Submission

In order to submit your assignment files open a terminal and change into the directory which contains all of your files. Type the command asgn-submit and press return. You should see this list of files printed.

    asgn.c        flexarray.h   mylib.h
    container.c   htable.c      rbt.c
    container.h   htable.h      rbt.h
    flexarray.c   mylib.c

If the file list looks correct then just press return, and you will be asked to rate the contribution of the members of your group. Choose the appropriate rating and you should then see the message: Submission complete.

Your assignment should be submitted before 4pm on the due date. All group members should submit a copy of the assignment.
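Returning to the command-line options, the following sketch shows one way the getopt loop could be wired up for the options in the table (-r, -s, -p, -i, -h). The struct options type, the parse_options and print_usage names, the wording of the usage message, and the convention of returning -1 when the caller should exit are all assumptions made for this example rather than part of the specification.

```c
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

#define DEFAULT_TABLE_SIZE 3877

/* Hypothetical record of the settings chosen on the command line. */
struct options {
    int use_rbt;      /* 1 if -r was given */
    int table_size;   /* value given with -s, or the default */
    int print_table;  /* 1 if -p was given */
    int print_info;   /* 1 if -i was given */
};

/* Writes a short usage message to stderr (wording is illustrative). */
static void print_usage(const char *prog) {
    fprintf(stderr,
            "Usage: %s [-r] [-s table-size] [-p] [-i] [-h] file\n",
            prog);
}

/* Fills in opts from argv. Returns the index of the first
 * non-option argument (the dictionary file name), or -1 if -h or
 * an unknown option was seen and the caller should exit. */
int parse_options(int argc, char **argv, struct options *opts) {
    const char *optstring = "rs:pih";
    int option;

    opts->use_rbt = 0;
    opts->table_size = DEFAULT_TABLE_SIZE;
    opts->print_table = 0;
    opts->print_info = 0;

    while ((option = getopt(argc, argv, optstring)) != -1) {
        switch (option) {
            case 'r':
                opts->use_rbt = 1;
                break;
            case 's':
                /* the text after -s is in the global optarg */
                opts->table_size = atoi(optarg);
                break;
            case 'p':
                opts->print_table = 1;
                break;
            case 'i':
                opts->print_info = 1;
                break;
            case 'h':
            default:
                print_usage(argv[0]);
                return -1;
        }
    }
    return optind;
}
```

A main function could call parse_options(argc, argv, &opts), exit if the result is -1, and otherwise open argv[result] as the dictionary file. Returning a value instead of calling exit inside the parser leaves the caller free to release any memory it has already allocated.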

Marking

This assignment is worth 15% of your final mark for Cosc 242. A lot of the code will have already been written as you completed labs during the course. If you have completed all of the labs and are careful to write clean, well-commented code which meets the specification you can get full marks. Marks are awarded for both implementation and style (although it should be noted that it is very bad style to have an implementation that doesn't work).

Allocation of marks

    Implementation       9
    Style/Readability    6
    Total               15

In order to maximise your marks please take note of the following points:

- Your code should compile without warnings on the Linux lab machines using the command:

    gcc -O2 -W -Wall -ansi -pedantic -lm *.c -o asgn

  If your code does not compile, it is considered to be a very, very bad thing!
- Your program should use good C layout as demonstrated in the lab book.
- No line should be more than 80 characters long.
- Most of your comments should be in your function headers. A function header should include:
    - A description of what the function does.
    - A description of all the parameters passed to it.
    - A description of the return value if there is one.
    - Any special notes.

Any assignments submitted after the due date and time will lose marks at a rate of 10% per day late.

You should not discuss issues pertaining to the assignment with anyone not in your group. All programs will be checked for similarity.

Part of this assignment involves you clarifying exactly what your program is required to do. Don't make assumptions, only to find out that they were incorrect when your assignment gets marked.

You should check your Computer Science email regularly, since email clarifications may be sent to the class email list.

If you have any questions about this assignment, or the way it will be assessed, please see Iain or send an email to ihewson@cs.otago.ac.nz.
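To make the function-header checklist from the marking notes concrete, here is a hedged example of a header covering each of the four points, attached to a small emalloc-style helper. The comment layout and the error message are assumptions; the lab book may prescribe a different style, in which case follow the lab book.

```c
#include <stdio.h>
#include <stdlib.h>

/**
 * Allocates a block of memory, ending the program with an error
 * message on stderr if the allocation fails.
 *
 * Parameters:
 *   s - the number of bytes to allocate.
 *
 * Returns:
 *   A pointer to the newly allocated block. The caller owns the
 *   block and must pass it to free() when finished with it.
 *
 * Notes:
 *   malloc(0) is allowed to return NULL, which this function
 *   would treat as a failure, so avoid asking for zero bytes.
 */
void *emalloc(size_t s) {
    void *result = malloc(s);
    if (result == NULL) {
        fprintf(stderr, "Memory allocation failed!\n");
        exit(EXIT_FAILURE);
    }
    return result;
}
```

The header answers what the function does, what it takes, what it returns, and what is unusual about it, so the body itself needs no further comments.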