Version Control Systems
Version Control
In the 2nd edition of Pro Git, version control is described as "a system that records changes to a file or set of files over time so that you can recall specific versions later".
Why bother? Members of a software development team need to:
- have access to the group's source code (file sharing)
- work at the same time on the same files (concurrent editing)
- keep track of different versions of the same file (history)
Concurrent Editing
Why is concurrent editing difficult or perilous? A normal file server (e.g. NFS) can provide file sharing, but it keeps only one version of each file (the most recent one). If two people edit the same file at the same time, whoever saves last silently overwrites the other person's changes.
Diagram by Brian W. Fitzpatrick, C. Michael Pilato, Copyright 2000, 2001, 2002, 2003, 2004 CollabNet, Inc.
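The overwrite problem can be sketched in a few lines of Python. This is only an illustration: the dictionary stands in for a file on a shared server, and the user names Harry and Sally are hypothetical.

```python
# A plain shared file: the server keeps only the most recent content.
shared_file = {"content": "original"}

# Both users open the same version of the file.
harry_copy = shared_file["content"]
sally_copy = shared_file["content"]

# Harry saves first, then Sally saves her copy over it.
shared_file["content"] = harry_copy + " + Harry's change"
shared_file["content"] = sally_copy + " + Sally's change"

# Sally never saw Harry's edit, so it is gone and there is no history to recover it from.
harry_change_survived = "Harry" in shared_file["content"]
print(shared_file["content"], harry_change_survived)
```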
Types of Version Control Systems
Version control systems come in three broad categories:
- Local Version Control Systems
- Centralized Version Control Systems
- Distributed Version Control Systems
Local Version Control System
- People save files on their own computers with slightly different names, e.g. resume_v1.docx, resume20150304.docx
- Software like RCS had a simple database that kept track of all file changes
Image from Pro Git, 2nd Edition
Centralized Version Control System
A Centralized Version Control System is a special file server, designed to support concurrent editing and to store file history information.
- Software like CVS and Subversion has a single server that contains all the versioned files, and a number of clients that check out files from that central place.
- This was the dominant model for many years.
- These systems usually implement one of two mechanisms: Lock-Modify-Unlock or Copy-Modify-Merge.
- Weakness: if the central repository (i.e. the server) gets corrupted, the history of all the changes could be lost.
Centralized VCS
Image from Pro Git, 2nd Edition
Lock-Modify-Unlock
A simple mechanism to support concurrent editing: a user locks a file, modifies it, then unlocks it so that others can edit it.
Disadvantages of this scheme:
- delays: locking a file prevents concurrent editing
- administrative overhead: if a user forgets to release the files they have locked, an administrator has to manually remove the lock before another user can edit the files
Diagram by Brian W. Fitzpatrick, C. Michael Pilato, Copyright 2000, 2001, 2002, 2003, 2004 CollabNet, Inc.
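As a rough sketch of how such a scheme behaves, the toy lock table below models the server side. The class name LockTable, its methods, and the user names are all hypothetical; this is not the API of any real version control system.

```python
class LockTable:
    """Toy model of a Lock-Modify-Unlock server (hypothetical, for illustration)."""

    def __init__(self):
        self._locks = {}  # file name -> user currently holding the lock

    def lock(self, filename, user):
        holder = self._locks.get(filename)
        if holder is not None and holder != user:
            return False  # someone else is editing: this user has to wait
        self._locks[filename] = user
        return True

    def unlock(self, filename, user):
        if self._locks.get(filename) != user:
            raise PermissionError(f"{user} does not hold the lock on {filename}")
        del self._locks[filename]

    def force_unlock(self, filename):
        # Administrator override for a lock that was never released.
        self._locks.pop(filename, None)


locks = LockTable()
harry_got_lock = locks.lock("index.php", "harry")  # Harry starts editing
sally_got_lock = locks.lock("index.php", "sally")  # Sally is blocked (delay)
locks.force_unlock("index.php")                    # Harry forgot to unlock; admin steps in
sally_retry = locks.lock("index.php", "sally")     # Sally can finally edit
print(harry_got_lock, sally_got_lock, sally_retry)
```

Both disadvantages listed above show up here: Sally's first request fails while Harry holds the lock, and only the administrator's override lets her continue.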
Another mechanism: Copy-Modify-Merge
Diagram by Brian W. Fitzpatrick, C. Michael Pilato, Copyright 2000, 2001, 2002, 2003, 2004 CollabNet, Inc.
Copy-Modify-Merge
When merging, two types of changes to a file can occur:
- changes that do not overlap: merging is trivial, since the result is simply the sum of both sets of changes
- changes that overlap: there is a conflict and merging can be difficult; users must communicate to decide which changes to propagate to the new version
Merging conflicted files is a manual process done by the user (no AI is available yet to decide which changes to take).
NOTE: The time it takes to resolve conflicts is usually far less than the time lost by locking a file in a Lock-Modify-Unlock system.
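The two cases can be illustrated with a deliberately simplified three-way merge. This sketch assumes that edits only replace existing lines (all three versions have the same number of lines); real merge tools also handle inserted and deleted lines. The function name merge_lines and the sample content are hypothetical.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equal-length line lists; returns (merged, conflicts)."""
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)      # both sides agree (or neither changed the line)
        elif m == b:
            merged.append(t)      # only the other user changed this line
        elif t == b:
            merged.append(m)      # only I changed this line
        else:
            merged.append(None)   # overlapping change: a person has to decide
            conflicts.append(i)
    return merged, conflicts


base = ["<h1>Home</h1>", "<p>Welcome</p>", "<p>Footer</p>"]
mine = ["<h1>Home Page</h1>", "<p>Welcome</p>", "<p>Footer</p>"]

# Case 1: the other user edited a different line, so the merge is automatic.
theirs = ["<h1>Home</h1>", "<p>Welcome</p>", "<p>Contact us</p>"]
clean_merge, clean_conflicts = merge_lines(base, mine, theirs)

# Case 2: the other user edited the same line, so line 0 is reported as a conflict.
theirs_overlap = ["<h1>Start</h1>", "<p>Welcome</p>", "<p>Footer</p>"]
_, overlap_conflicts = merge_lines(base, mine, theirs_overlap)
print(clean_merge, clean_conflicts, overlap_conflicts)
```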
Distributed Version Control System
- Software like Git and Mercurial can have a server that contains all the versioned files (though this is not necessary), but each user also has a full mirror of the repository locally.
- This greatly reduces the risk of losing the repository and its history, because any clone can be used to restore it.
- This model is becoming more and more popular compared with the centralized model.
- For this course we will be using Git.
- In the mid-2000s Linus Torvalds (the creator of Linux) wanted to create a better version control system, based on lessons learned with previous VCSs (BitKeeper specifically).
- The result was a fast, simple, powerful and efficient tool.
Image from Pro Git, 2nd Edition
Distributed VCS
Image from Pro Git, 2nd Edition
Centralized VCS Tracking
- Centralized systems store information as a list of file-based changes (delta, Δ, means change).
- The file can be rebuilt as of any point in time by applying, in order, the changes that were made to it.
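A minimal sketch of delta-based storage is shown below. It assumes a very simple delta format, a list of (line index, new text) replacements, which is a hypothetical stand-in for the richer diff formats that real systems use.

```python
def apply_delta(lines, delta):
    """Apply one delta: a list of (line_index, new_text) replacements."""
    result = list(lines)
    for index, new_text in delta:
        result[index] = new_text
    return result


def rebuild(base, deltas, version):
    """Rebuild the file as of `version` (0 = base) by replaying deltas in order."""
    lines = base
    for delta in deltas[:version]:
        lines = apply_delta(lines, delta)
    return lines


base = ["line one", "line two"]
deltas = [
    [(0, "LINE ONE")],           # delta 1
    [(1, "line two, revised")],  # delta 2
]
v1 = rebuild(base, deltas, 1)
v2 = rebuild(base, deltas, 2)
print(v1, v2)
```

Only the base file and the deltas are stored; every later version is computed on demand.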
Distributed VCS Tracking
Git, the distributed system used in this course, stores snapshots of your files at points in time, rather than a list of changes to each file.
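Snapshot-based storage can be sketched as follows. This is a toy model, not Git's actual object format: the names objects, snapshots, take_snapshot and checkout are hypothetical, and a plain SHA-1 of the content is used only as a lookup key. It does reflect one real property of Git, namely that a file which has not changed is not stored a second time.

```python
import hashlib

objects = {}    # content hash -> file content (each distinct content stored once)
snapshots = []  # each snapshot maps file name -> content hash


def take_snapshot(files):
    """Record the full state of all files and return the snapshot's id."""
    snapshot = {}
    for name, content in files.items():
        key = hashlib.sha1(content.encode("utf-8")).hexdigest()
        objects.setdefault(key, content)  # unchanged content is not stored again
        snapshot[name] = key
    snapshots.append(snapshot)
    return len(snapshots) - 1


def checkout(snapshot_id):
    """Return every file exactly as it was when the snapshot was taken."""
    return {name: objects[key] for name, key in snapshots[snapshot_id].items()}


first = take_snapshot({"index.php": "v1", "lab1.php": "hello"})
second = take_snapshot({"index.php": "v2", "lab1.php": "hello"})
print(checkout(first), checkout(second), len(objects))
```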
Git File Integrity
- Every file in a Git repository is checksummed.
- Effectively, every file is hashed with SHA-1, which produces a 40-character hexadecimal string that is stored for the file.
- If a file changes (whether through a content change, information lost in transit, or data corruption), it will have a very different hash and Git can detect it.
- This checksumming is built in at the lowest level and ensures that no change can go unnoticed.
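The hashing step can be sketched with Python's standard hashlib module. The helper name git_blob_sha1 and the sample file contents are hypothetical; the header format ("blob", a space, the content length, then a null byte) is how Git hashes file contents, so the result should match what git hash-object prints for the same bytes.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 that Git assigns to a file (blob) with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_sha1(b"<?php echo 'Hello'; ?>\n")
changed = git_blob_sha1(b"<?php echo 'Hello!'; ?>\n")  # one extra character

print(original)
print(changed)
print(len(original), original != changed)
```

Even a one-character change produces a completely different 40-character string, which is what lets Git notice corruption or tampering.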
The 3 States of Git
Git has three main states that your files can reside in: committed, modified, and staged.
- Committed means that the data is safely stored in your local database.
- Modified means that you have changed the file but have not committed it to your database yet.
- Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
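One way to picture the three states is to compare a file's content in three places: the last commit, the staging area, and the working directory. The function below is a simplified sketch with hypothetical names; it reports a single state per file, whereas in real Git a file can be staged and then modified again, so that it is in both states at once.

```python
def file_state(committed_hash, staged_hash, working_hash):
    """Classify a tracked file by comparing content identifiers (simplified)."""
    if working_hash != staged_hash:
        return "modified"   # changed on disk, change not yet staged
    if staged_hash != committed_hash:
        return "staged"     # marked to go into the next commit snapshot
    return "committed"      # identical to what is stored in the local database


print(file_state("abc", "abc", "abc"))  # nothing changed since the last commit
print(file_state("abc", "abc", "def"))  # edited, but not yet added
print(file_state("abc", "def", "def"))  # edited and added, waiting for a commit
```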
Git GUI Setup (Welcome) click Next
Git GUI Setup (License) click Next
Git GUI Setup (Destination) click Next
Git GUI Setup (Components) click Next NOTE: you can optionally include Additional icons in your Quick Launch, on your Desktop, or both
Git GUI Setup (Start Menu) click Next
Git GUI Setup (PATH) click Next
Git GUI Setup (SSH) click Next
Git GUI Setup (Line endings) click Next
Git GUI Setup (terminal emulator) click Next
Git GUI Setup (tweaks) click Next
Git GUI Setup Complete click Finish
Git Bash
Git Bash is the console in which you run Git commands.
Go to: Start -> All Programs -> Git -> Git Bash
Git Bash Commands - config
The git config command gets and sets variables that control how Git looks and operates. Whenever you set up Git on your local machine you should set your name and email address.
To set your user name use the following command:
    git config --global user.name "Bill Smith"
To set your email address use:
    git config --global user.email "bill.smith@dcmail.ca"
To see all of the configuration that applies to the repository you are in, run:
    git config -l
Git Bash - Navigation
To change to the directory where you want to clone your group's remote repository, use cd with the file path.
NOTE: a drive letter is written between forward slashes (e.g. /c/ for the C: drive):
    cd /c/users/userid/documents/webd3201/groupxx
Git Bash should show your userid@computer name, followed by the directory you changed to.
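The path convention above can be expressed as a small conversion helper. The function name to_git_bash_path is hypothetical and the sketch only handles paths that start with a drive letter.

```python
def to_git_bash_path(windows_path):
    """Convert a Windows path such as C:\\folder into Git Bash form (/c/folder)."""
    if len(windows_path) < 2 or windows_path[1] != ":" or not windows_path[0].isalpha():
        raise ValueError("expected a path that starts with a drive letter, e.g. C:\\...")
    drive = windows_path[0].lower()
    rest = windows_path[2:].replace("\\", "/")  # backslashes become forward slashes
    return f"/{drive}{rest}"


bash_path = to_git_bash_path(r"C:\users\userid\documents\webd3201\groupxx")
print("cd " + bash_path)
```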
Git Bash - Clone an Existing Remote Repository
In Git Bash, change to the directory where you want to have your local repository, then run the command:
    git clone user_id@opentech2.durhamcollege.org:/var/www/html/git/groupxx.git
Here user_id is your user id on the remote server and groupxx is your group number (with a leading zero for single-digit groups, i.e. group02 instead of group2).
You should be prompted for your opentech2 password.
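The naming rule for the repository address, including the leading zero, can be sketched as a small helper. The function name clone_url is hypothetical, and the sketch only builds the command; it does not run it.

```python
def clone_url(user_id, group_number):
    """Build the clone address; single-digit groups get a leading zero."""
    return (
        f"{user_id}@opentech2.durhamcollege.org:"
        f"/var/www/html/git/group{group_number:02d}.git"
    )


# "user_id" is a placeholder: substitute your own id on the remote server.
command = ["git", "clone", clone_url("user_id", 2)]
print(" ".join(command))
```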
Git Bash Commands - init
To create an empty repository in the current directory (not necessary in this class, it is already done for you):
    git init
To create a bare repository, i.e. one with no working directory, typically used as a shared remote:
    git init --bare
Git Bash Commands - add
To add (stage) a specific file named file.ext:
    git add file.ext
The above command can become tedious if you have several new files to add. To stage ALL changes in your local repository, including untracked files, use:
    git add -A
Every file that has been added is staged to be committed. Note that git add only affects your local repository; nothing is sent to the remote repository until you push.
Git Bash Commands - commit
To commit a snapshot of all changes to tracked files in the working directory:
    git commit -a
Note: you will be prompted to add a message as part of the commit. To supply the message up front:
    git commit -am "DESCRIBE YOUR CHANGES"
This commits all changes to tracked files, staged or not, to your local repository with the given commit message. Untracked files are not included; they must be added with git add first.
Git Bash Commands - pull
To retrieve any changes from the remote repository and merge them into your local repository:
    git pull
Note: you should do this before any commits, to ensure your code is up to date.
Git Bash Commands - push
To submit any committed changes from your local repository to the remote repository:
    git push
This is how you should share work with your group mates. You push your changed files to the remote repository; they then pull the modified files down into their local repositories.
Git Bash Commands - status
To show the current status of your local repository:
    git status
Consider a local repository with:
- an index.php file that has been committed before, but that you have since changed;
- a lab1.php file that you have added but not committed yet; and
- a lab2.php file that is in the repository folder but has not been added yet.
For this repository, git status would return:
    On branch master
    Your branch is up-to-date with 'origin/master'.
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
        new file:   lab1.php
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)
        modified:   index.php
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
        lab2.php
Git Bash Commands - status (cont'd)
For a short display you can use the -s flag:
    git status -s
For the same status presented in the last slide, the output would be:
     M index.php
    A  lab1.php
    ?? lab2.php
- M: modified (since the last committed version)
- A: added to the repository, but not yet committed
- ??: file is untracked (i.e. not part of the repository)
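The short format is easy to read programmatically: the first two characters of each line are the state codes (the first column for the staging area, the second for the working directory), followed by a space and the file name. The parser below is a minimal sketch with a hypothetical function name; it ignores special cases such as renamed files, and the sample text is the output from this slide rather than the result of running Git.

```python
def parse_short_status(output):
    """Split `git status -s` text into (staged_code, working_code, path) tuples."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        entries.append((line[0], line[1], line[3:]))
    return entries


sample = " M index.php\nA  lab1.php\n?? lab2.php\n"
parsed = parse_short_status(sample)
for staged_code, working_code, path in parsed:
    print(repr(staged_code), repr(working_code), path)
```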
Git Resources
Pro Git, 2nd Edition, by Scott Chacon and Ben Straub: https://git-scm.com/book/en/v2
Downloading Git GUI for Windows: https://git-scm.com/download/win