Rules and constraints in software construction PROGRAMMING STYLES

Similar documents
Tutorial 2: Validating Documents with DTDs

Part 1 (80 points) Multiple Choice Questions (20 questions * 4 points per question = 80 points)

Information Retrieval

Pointers, References and Arrays

INF 212 ANALYSIS OF PROG. LANGS. INTERACTIVITY. Prof. Crista Lopes

PLT Fall Shoo. Language Reference Manual

Introduction to MySQL /MariaDB and SQL Basics. Read Chapter 3!

CSCI 2101 Java Style Guide

Chapter 10: MySQL & PHP. PHP and MySQL CIS 86 Mission College

LECTURE 21. Database Interfaces

processing data with a database

Declaration Syntax. Declarations. Declarators. Declaration Specifiers. Declaration Examples. Declaration Examples. Declarators include:

Writing Beautiful Code. Anand Chitipothu

Final Exam Practice. Partial credit will be awarded.

Introduction to Computer Science Midterm 3 Fall, Points

Metadata for Component Optimisation

Entity Relationship Diagram (ERD) Dr. Moustafa Elazhary

Information Retrieval and Organisation

Alastair Burt Andreas Eisele Christian Federmann Torsten Marek Ulrich Schäfer. October 6th, Universität des Saarlandes. Introduction to Python

An array is a collection of data that holds fixed number of values of same type. It is also known as a set. An array is a data type.

CS150 - Sample Final

Introduction to Computer Science I Spring 2010 Sample mid-term exam Answer key

Chapter 4. In this chapter, you will learn:

2.4 Choose method names carefully

Spring 2008 June 2, 2008 Section Solution: Python

SQL Interview Questions

Formal Languages and Compilers Lecture VI: Lexical Analysis

x = 3 * y + 1; // x becomes 3 * y + 1 a = b = 0; // multiple assignment: a and b both get the value 0

Spreadsheets. With Microsoft Excel as an example

Publishing Conventions

Course Text. Course Description. Course Objectives. StraighterLine Introduction to Programming in C++

Kingdom of Saudi Arabia Princes Nora bint Abdul Rahman University College of Computer Since and Information System CS242 ARRAYS

CSc 372 Comparative Programming Languages

1) Introduction to SQL

XQ: An XML Query Language Language Reference Manual

A Simple Search Engine

This book is about using Visual Basic for Applications (VBA), which is a

Student Performance Q&A:

CSc 372 Comparative Programming Languages

Documentation of the UJAC print module's XML tag set.

ANSWERS. Birkbeck (University of London) Software and Programming 1 In-class Test Feb Student Name Student Number. Answer all questions

Administrative. Distributed indexing. Index Compression! What I did last summer lunch talks today. Master. Tasks

Chapter 6: Information Retrieval and Web Search. An introduction

Databases. Course October 23, 2018 Carsten Witt

Introduction to Computer Science with Python Course Syllabus

Entity-Relationship Model &

Web Information Retrieval. Lecture 4 Dictionaries, Index Compression

8. Characters and Arrays

Microsoft Word. Part 2. Hanging Indent

Conditionals !

Programming Paradigms Written Exam (6 CPs)

PRACTICE MIDTERM EXAM #2

COMP3 (JUN13COMP301) General Certificate of Education Advanced Level Examination June 2013

POSTGRESQL - PYTHON INTERFACE

St. Edmund Preparatory High School Brooklyn, NY

CSE143X: Computer Programming I & II Programming Assignment #9 due: Monday, 11/27/17, 11:00 pm

Chapter # 4 Entity Relationship (ER) Modeling

Programming Lecture 3

Overview of C. Basic Data Types Constants Variables Identifiers Keywords Basic I/O

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL

Unstructured Data. CS102 Winter 2019

CPE Summer 2015 Exam I (150 pts) June 18, 2015

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval

AP Computer Science A

Contextual Search using Cognitive Discovery Capabilities

Data Modelling: 1 Entity, Keys & Attributes

Programming Languages and Techniques (CIS120)

Recap: Functions as first-class values

CS Introduction to Computational and Data Science. Instructor: Renzhi Cao Computer Science Department Pacific Lutheran University Spring 2017

COURSE OF STUDY UNIT PLANNING GUIDE COMPUTER SCIENCE 1 FOR: 5 CREDITS GRADE LEVEL: 9-12 FULL-YEAR COURSE PREPARED BY: SUSIE EISEN

CSE au Final Exam Sample Solution

CPE 112 Spring 2015 Exam II (100 pts) March 4, Definition Matching (8 Points)

CSC Web Programming. Introduction to SQL

INF 212 ANALYSIS OF PROG. LANGS PLUGINS. Instructors: Crista Lopes Copyright Instructors.

CS 115 Exam 3, Fall 2016, Sections 5-8

CS 1803 Spring 2011 February 14, 2011 Exam 1 Name: Section: Grading TA:

Python and SQLite. COMP Lecture 25 March 26th, 2012 Mathieu Perreault. Monday, 26 March, 12

Lecture 27. Lecture 27: Regular Expressions and Python Identifiers

CS201 Discussion 7 MARKOV AND RECURSION

US Secret Service National Threat Assessment Center (NTAC), Insider threat study (2004)

AP Computer Science A

CSC324 Principles of Programming Languages

Practice problems Set 2

How to declare an array in C?

General Instructions. You can use QtSpim simulator to work on these assignments.

Preprocessor Directives

AP Computer Science A

Structure and Interpretation of Computer Programs

PESIT-BSC Department of Science & Humanities

The Anatomy of a Large-Scale Hypertextual Web Search Engine

CSE 373 Spring 2010: Midterm #1 (closed book, closed notes, NO calculators allowed)

Table of Contents. PDF created with FinePrint pdffactory Pro trial version

Co. Cavan VEC. Co. Cavan VEC. Programme Module for. Word Processing. leading to. Level 5 FETAC. Word Processing 5N1358. Word Processing 5N1358

Duplicate Detection addon for Dynamics CRM by Cowia

Creating and Running a Report

Information Retrieval

Introduction to MapReduce. Adapted from Jimmy Lin (U. Maryland, USA)

1 Fall 2017 CS 1110/1111 Exam 3

CS 115 Exam 3, Fall 2016, Sections 1-4

Midterm 2: CS186, Spring 2015

Transcription:

Rules and constraints in software construction PROGRAMMING STYLES

Programming Styles Ways of expressing tasks Exist and recur at all scales Frozen in Programming Languages

Why Are Styles Important? Basic frames of reference for solutions Like scaffolding Common vocabularies It s a cultural thing Some better than others Depending on many objectives

Programming Styles How do you teach this?

Raymond Queneau

Queneau s Exercises in Style Metaphor Surprises Dream Prognostication Hesitation Precision Negativities Asides Anagrams Logical analysis Past Present (99)

Queneau s Styles Constraints A Void (La Disparition) by Georges Perec No letter e

Exercises in Programming Style The story: Term Frequency given a text file, output a list of the 25 most frequently-occurring words, ordered by decreasing frequency

Exercises in Programming Style The story: Pride and Prejudice Term Frequency given a text file, output a list of the 25 most frequently-occurring words, ordered by decreasing frequency TF mr - 786 elizabeth - 635 very - 488 darcy - 418 such - 395 mrs - 343 much - 329 more - 327 bennet - 323 bingley - 306 jane - 295 miss - 283 one - 275 know - 239 before - 229 herself - 227 though - 226 well - 224 never - 220

@cristalopes #style1 name STYLE #1

# the global list of [word, frequency] pairs word_freqs = [] # the list of stop words with open('../stop_words.txt') as f: stop_words = f.read().split(',') stop_words.extend(list(string.ascii_lowercase))

for line in open(sys.argv[1]): for c in line:

Style #1 constraints No abstractions No use of library functions @cristalopes #style1 name

Style #1 constraints No abstractions No use of library functions Monolithic Style @cristalopes #style1 name

@cristalopes #style2 name STYLE #2

import re, sys, collections stopwords = set(open('../stop_words.txt').read().split(',')) words = re.findall('[a-z]{2,}', open(sys.argv[1]).read().lower()) counts = collections.counter(w for w in words if w not in stopwords) for (w, c) in counts.most_common(25): print w, '-', c Credit: Laurie Tratt, Kings College London

import re, sys, collections stopwords=set(open('../stop_words.txt').read().split(',')) words = re.findall('[a-z]{2,}', open(sys.argv[1]).read().lower()) counts = collections.counter(w for w in words \ if w not in stopwords) for (w, c) in counts.most_common(25): print w, '-', c

import re, string, sys stops = set(open("../stop_words.txt").read().split(",") + list(string.ascii_lowercase)) words = [x.lower() for x in re.split("[^a-za-z]+", open(sys.argv[1]).read()) if len(x) > 0 and x.lower() not in stops] unique_words = list(set(words)) unique_words.sort(lambda x,y:cmp(words.count(y), words.count(x))) print "\n".join(["%s - %s" % (x, words.count(x)) for x in unique_words[:25]])

Style #2 constraints As few lines of code as possible @cristalopes #style2 name

Style #2 constraints As few lines of code as possible Code Golf Style @cristalopes #style2 name

Style #2 constraints As few lines of code as possible Try Hard Style @cristalopes #style2 name

@cristalopes #style3 name STYLE #3

data=[] words=[] freqs=[] def frequencies(): def read_file(path): def sort(): def filter_normalize(): def scan(): def rem_stop_words(): # # Main # read_file(sys.argv[1]) filter_normalize() scan() rem_stop_words() frequencies() sort() for tf in word_freqs[0:25]: print tf[0], ' - ', tf[1]

Style #3 constraints Procedural abstractions maybe input, no output Shared state Larger problem solved by applying procedures, one after the other, changing the shared state @cristalopes #style3 name

Style #3 constraints Procedural abstractions maybe input, no output Shared state Series of commands Cook Book Style @cristalopes #style3 name

@cristalopes #style4 name STYLE #4

def read_file(path): return... def filter(str_data): return... def normalize(str_data): return... def scan(str_data): return... def sort(word_freqs): # # Main # return... return... wfreqs=st(fq(r(sc(n(fc(rf(sys.argv[1]))))))) for tf in wfreqs[0:25]: print tf[0], ' - ', tf[1] def rem_stop_words(wordl): return... def frequencies(wordl):

Style #4 constraints Function abstractions f: Input Output No shared state Function composition f º g @cristalopes #style4 name

Style #4 constraints Function abstractions f: Input Output No shared state Function composition f º g Pipeline Style @cristalopes #style4 name

@cristalopes #style5 name STYLE #5

def read_file(path, func):... return func(, normalize) def filter_chars(data, func):... return func(, scan) def normalize(data, func):... return func(,remove_stops) def scan(data, func):... return func(, frequencies) # Main w_freqs=read_file(sys.argv[1], filter_chars) for tf in w_freqs[0:25]: print tf[0], ' - ', tf[1] def remove_stops(data, func):... return func(, sort) Etc.

Style #5 constraints Functions take one additional parameter, f called at the end given what would normally be the return value plus the next function @cristalopes #style5 name

Kick forward Style @cristalopes #style5 name Style #5 constraints Functions take one additional parameter, f called at the end given what would normally be the return value plus the next function

@cristalopes #style6 name STYLE #6

class TFExercise(): def info(self): class DataStorageManager(TFExercise): def is_stop_word(self, word): def info(self): class WordFreqManager(TFExercise): def inc_count(self, word): def sorted(self): def info(self): class WordFreqController(TFExercise): def words(self): def run(self): def info(self): class StopWordManager(TFExercise): # Main WordFreqController(sys.argv[1]).run()

Style #6 constraints Things, things and more things! Capsules of data and procedures Data is never accessed directly Capsules can reappropriate procedures from other capsules @cristalopes #style6 name

Kingdom of Nouns Style @cristalopes #style6 name Style #6 constraints Things, things and more things! Capsules of data and procedures Data is never accessed directly Capsules can reappropriate procedures from other capsules

@cristalopes #style7 name STYLE #7

class DataStorageManager(): def dispatch(self, message): def dispatch(self, message): class WordFrequencyController(): def dispatch(self, message): class StopWordManager(): def dispatch(self, message): class WordFrequencyManager(): # Main wfcntrl = WordFrequencyController() wfcntrl.dispatch([ init,sys.argv[1]]) wfcntrl.dispatch([ run ])

Style #7 constraints (Similar to #6) Capsules receive messages via single receiving procedure @cristalopes #style7 name

Style #7 constraints (Similar to #6) Capsules receive messages via single receiving procedure Letterbox Style @cristalopes #style7 name

@cristalopes #style8 name STYLE #8

# Main splits = map(split_words, partition(read_file(sys.argv[1]), 200)) splits.insert(0, []) word_freqs = sort(reduce(count_words, splits)) for tf in word_freqs[0:25]: print tf[0], ' - ', tf[1]

def split_words(data_str) """ Takes a string (many lines), filters, normalizes to lower case, scans for words, and filters the stop words. Returns a list of pairs (word, 1), so [(w1, 1), (w2, 1),..., (wn, 1)] """... result = [] words = _rem_stop_words(_scan(_normalize(_filter(data_str)))) for w in words: result.append((w, 1)) return result

def count_words(pairs_list_1, pairs_list_2) """ Takes two lists of pairs of the form [(w1, 1),...] and returns a list of pairs [(w1, frequency),...], where frequency is the sum of all occurrences """ mapping = dict((k, v) for k, v in pairs_list_1) for p in pairs_list_2: if p[0] in mapping: mapping[p[0]] += p[1] else: mapping[p[0]] = 1 return mapping.items()

Style #8 constraints Two key abstractions: map(f, chunks) and reduce(g, results) @cristalopes #style8 name

Style #8 constraints Two key abstractions: map(f, chunks) and reduce(g, results) Map-Reduce Style @cristalopes #style8 name

@cristalopes #style9 name STYLE #9

# Main connection = sqlite3.connect(':memory:') create_db_schema(connection) load_file_into_database(sys.argv[1], connection) # Now, let's query c = connection.cursor() c.execute("select value, COUNT(*) as C FROM words GROUP BY value ORDER BY C DESC") for i in range(25): row = c.fetchone() if row!= None: print row[0] + ' - ' + str(row[1]) connection.close()

def create_db_schema(connection): c = connection.cursor() c.execute('''create TABLE documents(id PRIMARY KEY AUTOINCREMENT, name)''' c.execute('''create TABLE words(id, doc_id, value)''') c.execute('''create TABLE characters(id, word_id, value)''') connection.commit() c.close()

# Now let's add data to the database # Add the document itself to the database c = connection.cursor() c.execute("insert INTO documents (name) VALUES (?)", (path_to_f c.execute("select id from documents WHERE name=?", (path_to_fil doc_id = c.fetchone()[0] # Add the words to the database c.execute("select MAX(id) FROM words") row = c.fetchone() word_id = row[0] if word_id == None: word_id = 0 for w in words: c.execute("insert INTO words VALUES (?,?,?)", (word_id, d # Add the characters to the database char_id = 0 for char in w: c.execute("insert INTO characters VALUES (?,?,?)", (c char_id += 1 word_id += 1 connection.commit() c.close()

Style #9 constraints Entities and relations between them Query engine Declarative queries @cristalopes #style9 name

Style #9 constraints Entities and relations between them Query engine Declarative queries Persistent Tables Style @cristalopes #style9 name

Take Home Many ways of solving problems Know them, assess them What are you trying to optimize? Constraints are important for communication Make them explicit Don t be hostage of one way of doing things

@cristalopes