Public-Service Announcement

Similar documents
Public-Service Announcement

Lecture #39. Initial grading run Wednesday. GUI files up soon. Today: Dynamic programming and memoization.

Basics of Git GitHub

Git - A brief overview

Git. Ľubomír Prda. IT4Innovations.

What is version control? (discuss) Who has used version control? Favorite VCS? Uses of version control (read)

I m an egotistical bastard, and I name all my projects after myself. First Linux, now git. Linus Torvalds, creator of Linux and Git

Decentralized Version Control Systems

Git. (Why not CVS?... because Git.) Karel Zak Florian Festi Bart Trojanowski December 20, 2007

Software Revision Control for MASS. Git Basics, Best Practices

AIS Grid School 2015

Overview. 1. Install git and create a Github account 2. What is git? 3. How does git work? 4. What is GitHub? 5. Quick example using git and GitHub

1. Which of these Git client commands creates a copy of the repository and a working directory in the client s workspace. (Choose one.

GETTING STARTED WITH. Michael Lessard Senior Solutions Architect June 2017

Version Control: Gitting Started

Version Control with Git

August 22, New Views on your History with git replace. Christian Couder

Git. A Distributed Version Control System. Carlos García Campos

Version Control Systems

An introduction to git

Version control with Git.

Using Git For Development. Shantanu Pavgi, UAB IT Research Computing

Git for Version Control

A Practical Introduction to Version Control Systems

GIT VERSION CONTROL TUTORIAL. William Wu 2014 October 7

Lab Exercise Git: A distributed version control system

Algorithm Engineering

Git. CSCI 5828: Foundations of Software Engineering Lecture 02a 08/27/2015

Using GitHub to Share with SparkFun a

Fundamentals of Git 1

Version Control. Second level Third level Fourth level Fifth level. - Software Development Project. January 17, 2018

EECS 470 Lab 4. Version Control System. Friday, 31 st January, 2014

INET

Push up your code next generation version control with (E)Git

CS 390 Software Engineering Lecture 4 - Git

Integrity of messages

Announcements. Last modified: Thu Nov 29 16:15: CS61B: Lecture #40 1

CS 320 Introduction to Software Engineering Spring February 06, 2017

Having Fun with Social Coding. Sean Handley. February 25, 2010

Version Control with GIT

CS 3410 Ch 7 Recursion

Best Practices. Joaquim Rocha IT-DSS-TD

Versioning with git. Moritz August Git/Bash/Python-Course for MPE. Moritz August Versioning with Git

Notes for Lecture 21. From One-Time Signatures to Fully Secure Signatures

Public-Service Announcement I

Practice Problems for the Final

John DeDourek Professor Emeritus Faculty of Computer Science University of New Brunswick GIT

What is git? Distributed Version Control System (VCS); Created by Linus Torvalds, to help with Linux development;

CS61B Lecture #32. Last modified: Sun Nov 5 19:32: CS61B: Lecture #32 1

Design and Analysis of Algorithms Prof. Madhavan Mukund Chennai Mathematical Institute. Module 02 Lecture - 45 Memoization

Git Introduction CS 400. February 11, 2018

CS 390 Software Engineering Lecture 3 Configuration Management

COMP 250. Lecture 27. hashing. Nov. 10, 2017

Introduction to Scientific Computing

Search and Optimization

GIT A Stupid Content Tracker

CS61B Lecture #2. Public Service Announcements:

Lab session 1 Git & Github

ECE608 - Chapter 16 answers

Human Version Control

Debugging and Version control

Version Control. Version Control. Human Version Control. Version Control Systems

Cryptographic Hash Functions

CSCI S-Q Lecture #12 7/29/98 Data Structures and I/O

F17 Modern Version Control with Git. Aaron Perley

COMP 182: Algorithmic Thinking Dynamic Programming


CS 1110: Introduction to Computing Using Python Loop Invariants

CS408 Cryptography & Internet Security

Tutorial: Getting Started with Git. Introduction to version control Benefits of using Git Basic commands Workflow

About SJTUG. SJTU *nix User Group SJTU Joyful Techie User Group

Using Git and GitLab: Your first steps. Maurício

Cyber Security Applied Cryptography. Dr Chris Willcocks

Introduction to the UNIX command line

P2_L8 - Hashes Page 1

Computer Science Design I Version Control with Git

Winter 2011 Josh Benaloh Brian LaMacchia

Lecture #40: Course Summary

S. Erfani, ECE Dept., University of Windsor Network Security

Overview. CSC 580 Cryptography and Computer Security. Hash Function Basics and Terminology. March 28, Cryptographic Hash Functions (Chapter 11)

Git and GitHub. Dan Wysocki. February 12, Dan Wysocki Git and GitHub February 12, / 48

Code and data management with Git. Department of Human Genetics Center for Human and Clinical Genetics

Lab 08. Command Line and Git

S18 Modern Version Control with Git

Git. Christoph Matthies Software Engineering II WS 2018/19. Enterprise Platform and Integration Concepts group

Version Control. CSC207 Fall 2014

Version Control with Git

Introduction to Version Control

Lecturers: Sanjam Garg and Prasad Raghavendra March 20, Midterm 2 Solutions

MITOCW watch?v=zlohv4xq_ti

Programming with Haiku

Solving Minesweeper Using CSP

Linus Torvalds inventor of Linux wanted a better source control system so he wrote one

CS61BL. Lecture 3: Asymptotic Analysis Trees & Tree Traversals Stacks and Queues Binary Search Trees (and other trees)

Software Project (Lecture 4): Git & Github

A hash function is strongly collision-free if it is computationally infeasible to find different messages M and M such that H(M) = H(M ).

CSE 391 Lecture 9. Version control with Git

And Parallelism. Parallelism in Prolog. OR Parallelism

COMPUTER SCIENCE IN THE NEWS. CS61A Lecture 7 Complexity and Orders of Growth TODAY PROBLEM SOLVING: ALGORITHMS COMPARISON OF ALGORITHMS 7/10/2012

Common Git Commands. Git Crash Course. Teon Banek April 7, Teon Banek (TakeLab) Common Git Commands TakeLab 1 / 18

Transcription:

Public-Service Announcement Hi EECS grads, Interested in putting your engineering skills to the test? Join the BERC Hackathon this Friday and Saturday for a chance to win some prizes! Location: Berkeley Institute for Data Science, 190, Doe Library, Berkeley, CA 94720, USA. Time: 5:00 9:00 Calling all scientists, software engineers, policy makers, designers, and entrepreneurs: join us for the 5th annual BERC Cleanweb Hackathon! Teams have 24 hours to create new solutions to challenges in energy, environment, and climate, with a chance to win $2000 in cash prizes. Free food, beverages, and swag will be provided throughout the weekend. Details and Registration: https://our-hacks.eventbrite.com Prizes: - 1st Place: $1000-2nd Place: $750 - People s Choice: $250 Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 1

Lecture #35 Today Dynamic programming and memoization. Anatomy of Git. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 2

A puzzle (D. Garcia): Dynamic Programming Start with a list with an even number of non-negative integers. Each player in turn takes either the leftmost number or the rightmost. Idea is to get the largest possible sum. Example: starting with (6, 12, 0, 8), you (as first player) should take the 8. Whatever the second player takes, you also get the 12, for a total of 20. Assuming your opponent plays perfectly (i.e., to get as much as possible), how can you maximize your sum? Can solve this with exhaustive game-tree search. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 3

Recursion makes it easy, again: Obvious Program int bestsum(int[] V) { int total, i, N = V.length; for (i = 0, total = 0; i < N; i += 1) total += V[i]; return bestsum(v, 0, N-1, total); /** The largest sum obtainable by the first player in the choosing * game on the list V[LEFT.. RIGHT], assuming that TOTAL is the * sum of all the elements in V[LEFT.. RIGHT]. */ int bestsum(int[] V, int left, int right, int total) { if (left > right) return 0; else { int L = total - bestsum(v, left+1, right, total-v[left]); int R = total - bestsum(v, left, right-1, total-v[right]); return Math.max(L, R); Time cost is C(0) = 1, C(N) = 2C(N 1); so C(N) Θ(2 N ) Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 4

Still Another Idea from CS61A The problem is that we are recomputing intermediate results many times. Solution: memoize the intermediate results. Here, we pass in an N N array (N = V.length) of memoized results, initialized to -1. int bestsum(int[] V, int left, int right, int total, int[][] memo) { if (left > right) return 0; else if (memo[left][right] == -1) { int L = total - bestsum(v, left+1, right, total-v[left], memo); int R = total - bestsum(v, left, right-1, total-v[right], memo); memo[left][right] = Math.max(L, R); return memo[left][right]; Now the number of recursive calls to bestsum must be O(N 2 ), for N = the length of V, an enormous improvement from Θ(2 N )! Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 5

Iterative Version I prefer the recursive version, but the usual presentation of this idea known as dynamic programming is iterative: int bestsum(int[] V) { int[][] memo = new int[v.length][v.length]; int[][] total = new int[v.length][v.length]; for (int i = 0; i < V.length; i += 1) memo[i][i] = total[i][i] = V[i]; for (int k = 1; k < V.length; k += 1) for (int i = 0; i < V.length-k-1; i += 1) { total[i][i+k] = V[i] + total[i+1][i+k]; int L = total[i][i+k] - memo[i+1][i+k]; int R = total[i][i+k] - memo[i][i+k-1]; memo[i][i+k] = Math.max(L, R); return memo[0][v.length-1]; That is, we figure out ahead of time the order in which the memoized version will fill in memo, and write an explicit loop. Save the time needed to check whether result exists. But I say, why bother unless it s necessary to save space? Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 6

Longest Common Subsequence Problem: Find length of the longest string that is a subsequence of each of two other strings. Example: Longest common subsequence of sally sells sea shells by the seashore and sarah sold salt sellers at the salt mines is sa sl sa sells the sae (length 23) Similarity testing, for example. Obvious recursive algorithm: /** Length of longest common subsequence of S0[0..k0-1] * and S1[0..k1-1] (pseudo Java) */ static int lls(string S0, int k0, String S1, int k1) { if (k0 == 0 k1 == 0) return 0; if (S0[k0-1] == S1[k1-1]) return 1 + lls(s0, k0-1, S1, k1-1); else return Math.max(lls(S0, k0-1, S1, k1), lls(s0, k0, S1, k1-1); Exponential, but obviously memoizable. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 7

Memoized Longest Common Subsequence /** Length of longest common subsequence of S0[0..k0-1] * and S1[0..k1-1] (pseudo Java) */ static int lls(string S0, int k0, String S1, int k1) { int[][] memo = new int[k0+1][k1+1]; for (int[] row : memo) Arrays.fill(row, -1); return lls(s0, k0, S1, k1, memo); private static int lls(string S0, int k0, String S1, int k1, int[][] memo) { if (k0 == 0 k1 == 0) return 0; if (memo[k0][k1] == -1) { if (S0[k0-1] == S1[k1-1]) memo[k0][k1] = 1 + lls(s0, k0-1, S1, k1-1, memo); else memo[k0][k1] = Math.max(lls(S0, k0-1, S1, k1, memo), lls(s0, k0, S1, k1-1, memo)); return memo[k0][k1]; Q: How fast will the memoized version be? Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 8

Memoized Longest Common Subsequence /** Length of longest common subsequence of S0[0..k0-1] * and S1[0..k1-1] (pseudo Java) */ static int lls(string S0, int k0, String S1, int k1) { int[][] memo = new int[k0+1][k1+1]; for (int[] row : memo) Arrays.fill(row, -1); return lls(s0, k0, S1, k1, memo); private static int lls(string S0, int k0, String S1, int k1, int[][] memo) { if (k0 == 0 k1 == 0) return 0; if (memo[k0][k1] == -1) { if (S0[k0-1] == S1[k1-1]) memo[k0][k1] = 1 + lls(s0, k0-1, S1, k1-1, memo); else memo[k0][k1] = Math.max(lls(S0, k0-1, S1, k1, memo), lls(s0, k0, S1, k1-1, memo)); return memo[k0][k1]; Q: How fast will the memoized version be? Θ(k 0 k 1 ) Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 9

Git: A Case Study in System and Data-Structure Design Git is a distributed version-control system, apparently the most popular of these currently. Conceptually, it stores snapshots (versions) of the files and directory structure of a project, keeping track of their relationships, authors, dates, and log messages. It is distributed, in that there can be many copies of a given repository, each supporting indepenent development, with machinery to transmit and reconcile versions between repositories. Its operation is extremely fast (as these things go). Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 10

A Little History Developed by Linus Torvalds and others in the Linux community when the developer of their previous, propietary VCS (Bitkeeper) withdrew the free version. Initial implementation effort seems to have taken about 2 3 months, in time for the 2.6.12 Linux kernel release in June, 2005. As for the name, according to Wikipedia, Torvalds has quipped about the name Git, which is British English slang meaning unpleasant person. Torvalds said: I m an egotistical bastard, and I name all my projects after myself. First Linux, now git. The man page describes Git as the stupid content tracker. Initially, was a collection of basic primitives (now called plumbing ) that could be scripted to provide desired functionality. Then, higher-level commands ( porcelain ) built on top of these to provide a convenient user interface. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 11

Major User-Level Features (I) Abstraction is of a graph of versions or snapshots (called commits) of a complete project. The graph structure reflects ancestory: which versions came from which. Each commit contains A directory tree of files (like a Unix directory). Information about who committed and when. Log message. Pointersto commit (or commits, if therewasamerge)from which the commit was derived. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 12

Conceptual Structure Main internal components consist of four types of object: Blobs: basically hold contents of files. Trees: directory structures of files. Commits: Contain references to trees and additional information (committer, date, log message). Tags: References to commits or other objects, with additional information, intended to identify releases, other important versions, or various useful information. (Won t mention further today). Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 13

Commits, Trees, Files Version 1 Version 2 Version 3 Commits Trees Dashed lines link objects that are the same D F G D F G D F G F1 G1 F2 G1 F2 G1 H I H I H H1 I1 H1 I1 H1 Blobs (files) Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 14

Version Histories in Two Repositories Repository 1 Repository 2 Repository 2 after pushing V6 to it V1 V1 V1 V2 V2 V2 V3 V5 V3 V9 V3 V5 V9 V4 V6 V4 V4 V6 V7 V8 V8 Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 15

Major User-Level Features (II) Each commit has a name that uniquely identifies it to all versions. Repositories can transmit collections of versions to each other. Transmitting a commit from repository A to repository B requires only the transmission of those objects (files or directory trees) that B does not yet have (allowing speedy updating of repositories). Repositories maintain named branches, which are simply identifiers of particular commits that are updated to keep track of the most recent commits in various lines of development. Likewise, tags are essentially named pointers to particular commits. Differ from branches in that they are not usually changed. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 16

Internals Each Git repository is contained in a directory. Repository may either be bare (just a collection of objects and metadata), or may be included as part of a working directory. The data of the repository is stored in various objects corresponding to files (or other leaf content), trees, and commits. To save space, data in files is compressed. Git can garbage-collect the objects from time to time to save additional space. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 17

The Pointer Problem Objects in Git are files. How should we represent pointers between them? Want to be able to transmit objects from one repositoryto another with different contents. How do you transmit the pointers? Only want to transfer those objects that are missing in the target repository. How do we know which those are? Could use a counter in each repository to give each object there a unique name. But how can that work consistently for two independent repositories? Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 18

Content-Addressable File System Could use some way of naming objects that is universal. We use the names, then, as pointers. Solves the Which objects don t you have? problem in an obvious way. Conceptually, what is invariant about an object, regardless of repository, is its contents. But can t use the contents as the name for obvious reasons. Idea: Use a hash of the contents as the address. Problem: That doesn t work! Brilliant Idea: Use it anyway!! Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 19

How A Broken Idea Can Work The idea is to use a hash function that is so unlikely to have a collision that we can ignore that possibility. Cryptographic Hash Functions have relevant property. Such a function, f, is designed to withstand cryptoanalytic attacks. In particular, should have Pre-image resistance: given h = f(m), should be computationally infeasible to find such a message m. Second pre-image resistance: given message m 1, should be infeasible to find m 2 m 1 such that f(m 1 ) = f(m 2 ). Collision resistance: should be difficult to find any two messages m 1 m 2 such that f(m 1 ) = f(m 2 ). With these properties, scheme of using hash of contents as name is extremely unlikely to fail, even when system is used maliciously. Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 20

SHA1 Git uses SHA1 (Secure Hash Function 1). Can play around with this using the hashlib module in Python3. All object names in Git are therefore 160-bit hash codes of contents, in hex. E.g. arecentcommitinthesharedcs61brepositorycouldbefetched (if needed) with git checkout 4641d45114656f2fd90b571ebf76649298060291 Last modified: Tue Nov 14 17:01:55 2017 CS61B: Lecture #35 21