CS 307: UNIX PROGRAMMING ENVIRONMENT WORKING WITH FILES AND COLLECTIONS OF FILES

Prof. Michael J. Reale, Fall 2014

Credit Where Credit Is Due
  Prof. Nick Merante's notes: http://web.cs.sunyit.edu/~merantn/cs307/
  Indiana University Tutorial on Tar: https://kb.iu.edu/d/acfi

File Compression

File Compression
  There are three main file compression utilities:
      gzip     Standard Unix compression algorithm
      bzip2    Slower but better compression
      xz       Very slow but best compression

gzip
  The gzip command is the Unix standard for compression.
  Uses Lempel-Ziv coding (LZ77).
  Compresses one or more files:
      gzip mightybigfile
      gzip file1 file2 file3
  The compressed file still has the same permissions, access times, etc.
  The original file is replaced with the compressed file, and .gz is added to the name.
  E.g., after gzip File1, File1 is gone; it is replaced with File1.gz.

gzip Compression Quality
  You can also specify how good you want the compression to be:
      -1  --fast    Fastest, but worst compression
      -9  --best    Slowest, but optimal compression (at least with the approach used)
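
The trade-off between the two extremes can be checked on a small scale. The following is only a sketch: the sample file and its contents are made up for illustration, and the exact byte counts will vary between gzip versions.

```shell
#!/bin/sh
# Compare gzip -1 (--fast) and gzip -9 (--best) on the same input.
# sample.txt is a hypothetical file generated just for this demo.
workdir=$(mktemp -d)

i=0
while [ "$i" -lt 5000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$workdir/sample.txt"

# -c writes to STDOUT, so sample.txt itself is left in place.
gzip -1 -c "$workdir/sample.txt" > "$workdir/fast.gz"
gzip -9 -c "$workdir/sample.txt" > "$workdir/best.gz"

orig_size=$(wc -c < "$workdir/sample.txt" | tr -d ' ')
fast_size=$(wc -c < "$workdir/fast.gz" | tr -d ' ')
best_size=$(wc -c < "$workdir/best.gz" | tr -d ' ')

echo "original: $orig_size bytes"
echo "gzip -1:  $fast_size bytes"
echo "gzip -9:  $best_size bytes"

rm -rf "$workdir"
```

On repetitive text like this, both levels shrink the file considerably and -9 should come out no larger than -1; on large inputs the difference in running time becomes more noticeable than the difference in size.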

gunzip
  The gunzip command decompresses (restores) one or more files.
      E.g., gunzip file1.gz
  Note: ignores files without a .gz or .tgz extension (suffix).
  Can override the suffix with the -S option:
      E.g., gunzip -S .waffle test2.waffle
  The compressed file is replaced with the original (decompressed) file.
      E.g., after gunzip file1.gz, file1.gz is replaced with file1
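
A short round trip shows the replace-in-place behavior of both commands, including a custom suffix. This is a sketch with a made-up file; it assumes a gzip that accepts -S when compressing as well as when decompressing (GNU gzip does).

```shell
#!/bin/sh
# gzip replaces File1 with File1.gz; gunzip puts File1 back.
workdir=$(mktemp -d)
printf 'hello, gzip\n' > "$workdir/File1"

gzip "$workdir/File1"
after_gzip=$(ls "$workdir")          # only File1.gz remains

gunzip "$workdir/File1.gz"
after_gunzip=$(ls "$workdir")        # only File1 remains

# Same round trip with a non-standard suffix (note the leading dot).
gzip -S .waffle "$workdir/File1"
after_waffle=$(ls "$workdir")        # only File1.waffle remains
gunzip -S .waffle "$workdir/File1.waffle"

restored=$(cat "$workdir/File1")
echo "after gzip:      $after_gzip"
echo "after gunzip:    $after_gunzip"
echo "after gzip -S:   $after_waffle"
echo "restored text:   $restored"

rm -rf "$workdir"
```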

gzip/gunzip: Leaving Files Intact
  By default, the original file is replaced with the compressed file (and vice versa with decompression).
  To keep the existing files, use the -c option to write to STDOUT:
      gzip -c test > test.gz
          Compresses test and writes the result to test.gz
          File test is still there
      gunzip -c test.gz
          Decompresses test.gz and writes to the terminal (STDOUT)
          File test.gz is still there
      gunzip -c test.gz > newtest
          Decompresses test.gz and writes it to newtest
          File test.gz is still there
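
The -c workflow can be verified end to end. In this sketch the file contents are invented for the demo; the point is that all three files exist afterwards and the decompressed copy is byte-for-byte identical to the original.

```shell
#!/bin/sh
# Use -c so that neither the original nor the compressed file is removed.
workdir=$(mktemp -d)
printf 'keep me around\n' > "$workdir/test"

gzip -c "$workdir/test" > "$workdir/test.gz"          # test is still there
gunzip -c "$workdir/test.gz" > "$workdir/newtest"     # test.gz is still there

files=$(ls "$workdir")
echo "$files"

# cmp -s is silent and only reports through its exit status.
if cmp -s "$workdir/test" "$workdir/newtest"; then
    same="yes"
else
    same="no"
fi
echo "newtest identical to test: $same"

rm -rf "$workdir"
```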

gzip/gunzip as Filters
  gzip and gunzip read data from STDIN and write to STDOUT if no files are specified:
      (CNT=0; while [ $CNT -lt 1000 ]; do CNT=`expr $CNT + 1`; /usr/games/fortune; done) | gzip > test.gz
  Prints a thousand fortunes, pipes them to gzip, and writes the compressed data to test.gz (which didn't exist before).
  gunzip can decompress something right to the terminal:
      cat test.gz | gunzip
  gzip, on the other hand, will NOT write compressed data to the terminal UNLESS you use the -f (force) option.
  Not a great idea anyway, but it's good to know.
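
Since fortune is not installed everywhere, the same filter pattern can be tried with a plain echo loop. This is a sketch; the "message N" lines are a stand-in for the fortunes.

```shell
#!/bin/sh
# gzip and gunzip as filters: data flows through pipes, no input file names given.
workdir=$(mktemp -d)

# Generate 1000 lines in a subshell and pipe them straight into gzip.
(n=0; while [ "$n" -lt 1000 ]; do n=$((n + 1)); echo "message $n"; done) \
    | gzip > "$workdir/test.gz"

# Decompress through a pipe and count what comes out the other end.
line_count=$(cat "$workdir/test.gz" | gunzip | wc -l | tr -d ' ')
last_line=$(cat "$workdir/test.gz" | gunzip | tail -n 1)

echo "lines recovered: $line_count"
echo "last line:       $last_line"

rm -rf "$workdir"
```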

gzcat
  The gzcat command decompresses a file and then prints it to the terminal (a la cat).
  Same as doing gunzip -c.
  On many Linux systems the equivalent command is named zcat.

bzip2/bunzip2
  The bzip2/bunzip2 commands give better compression but take longer.
  Uses the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding.
  Very similar in options and usage to gzip (but not identical).
  Uses the .bz2, .bz, .tbz2, or .tbz extensions.
  Can read from STDIN and write to STDOUT (if no filenames are specified or when using the -c option).
  bzcat has the same function as gzcat.

xz/unxz
  The xz/unxz commands are very slow but give the best compression results.
  Again, very similar options and usage to gzip and bzip2.
  Also has the same STDIN/STDOUT behavior.
  Uses the .xz format.
  Can also handle the legacy .lzma format.
  Also has xzcat.
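
Because the three tools share the -c and -d options, one loop can compare them. This is a sketch on a made-up sample file; bzip2 and xz may not be installed on every system, so the loop skips any tool it cannot find, and the sizes it prints depend on the input and tool versions.

```shell
#!/bin/sh
# Compress the same file with gzip, bzip2, and xz and report the sizes.
workdir=$(mktemp -d)

i=0
while [ "$i" -lt 3000 ]; do
    echo "entry $i: some moderately repetitive sample text"
    i=$((i + 1))
done > "$workdir/sample.txt"

report=""
for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        size=$("$tool" -c "$workdir/sample.txt" | wc -c | tr -d ' ')
        # Round trip: compress, decompress with -d, compare against the original.
        if "$tool" -c "$workdir/sample.txt" | "$tool" -dc | cmp -s - "$workdir/sample.txt"; then
            status="round trip ok"
        else
            status="ROUND TRIP FAILED"
        fi
        report="${report}${tool}: ${size} bytes, ${status}
"
    else
        report="${report}${tool}: not installed
"
    fi
done

printf '%s' "$report"
rm -rf "$workdir"
```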

Tar

Introduction
  So far, we're able to compress single, regular files.
  What if we want to compress multiple files and/or a whole directory as one big file?
  We have to somehow turn all the files (or the contents of the directory) into one file.

Tape Drives
  In days of yore (and to a MUCH lesser extent even now), tape drives were used to store/archive data.
  VERY slow, but high capacity.
  The tar utility was originally written to archive data to tape drives.
  Now, we use it to archive files/directories.

Tar: Tape Archiver
  The tar command concatenates file contents (each separated by header information).
  Preserves owner, permissions, timestamp information, etc.
      -f    Specify the tar file (either as input or output)
      -v    Verbose output; lists the file names it is reading from/writing to the archive

Creating an Archive
  To create an archive from files and directories, use the cf option:
      tar cf myarchive.tar file1 file2 file3
  Puts file1, file2, and file3 into the archive myarchive.tar.

Unpacking an Archive
  To unpack the archive, use the xf option:
      tar xf myarchive.tar

Listing the Contents
  To list the contents of a tar file without unpacking it, use the tf option:
      tar -tf myarchive.tar
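
The create, list, and unpack steps can be run together as one small session. This is a sketch with made-up files; the -C option (change directory before reading or writing files) is not covered in these slides and is used here only to keep everything inside a temporary directory. It is available in both GNU tar and BSD tar.

```shell
#!/bin/sh
# Create an archive, list it, and unpack it somewhere else.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"
printf 'one\n'   > "$workdir/src/file1"
printf 'two\n'   > "$workdir/src/file2"
printf 'three\n' > "$workdir/src/file3"

# c = create, f = archive file name
tar -cf "$workdir/myarchive.tar" -C "$workdir/src" file1 file2 file3

# t = list contents without unpacking
listing=$(tar -tf "$workdir/myarchive.tar")
echo "$listing"

# x = extract (into the out directory here)
tar -xf "$workdir/myarchive.tar" -C "$workdir/out"
extracted=$(cat "$workdir/out/file1" "$workdir/out/file2" "$workdir/out/file3")
echo "$extracted"

rm -rf "$workdir"
```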

Compressing AND Tarring
  If you're using GNU tar, you can compress and tar files at the same time:
      -z    Use gzip compression
      -j    Use bzip2 compression (FreeBSD's tar also accepts -y for this)
  Examples:
      tar cvzf myarchive.tgz file1 file2 file3
          Creates a compressed archive with gzip
      tar xvjf myarchive.tbz2
          Unpacks a bzip2-compressed archive
  .tgz = .tar.gz and .tbz2 = .tar.bz2
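
The -z option is a shortcut for running tar and gzip as two stages of a pipeline. The sketch below builds the same archive both ways and checks that the listings agree; the file names are made up, and -C is used only to keep the demo inside a temporary directory.

```shell
#!/bin/sh
# tar -z versus an explicit tar | gzip pipeline.
workdir=$(mktemp -d)
mkdir "$workdir/src"
printf 'one\n' > "$workdir/src/file1"
printf 'two\n' > "$workdir/src/file2"

# One step: tar does the gzip compression itself.
tar -czf "$workdir/myarchive.tgz" -C "$workdir/src" file1 file2

# Two steps: "-f -" makes tar write the archive to STDOUT, gzip acts as a filter.
tar -cf - -C "$workdir/src" file1 file2 | gzip > "$workdir/piped.tar.gz"

# List both; the second one is decompressed by gunzip and read by tar from STDIN.
list_direct=$(tar -tzf "$workdir/myarchive.tgz")
list_piped=$(gunzip -c "$workdir/piped.tar.gz" | tar -tf -)

echo "tar -z listing:"
echo "$list_direct"
echo "pipeline listing:"
echo "$list_piped"

rm -rf "$workdir"
```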

Assorted Useful Utilities

srm: Secure Removal
  The srm command securely deletes files.
  Overwrites the file data and then deletes the file (unlinks the hard link).
  May not be available on all systems (or may be named something else).

split
  The split command allows you to split a file into pieces.
  Useful when you have a VERY large .tar file.
  Syntax:
      split -b byte_count[K|k|M|m|G|g] [-a suffix_length] [file [prefix]]
      -a    Suffix length; determines how many letters to use for each part
  Example:
      split -b 650m -a 1 big_tarball.tar
  Default prefix: x, so the pieces become xa, xb, xc, ...
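
The pieces produced by split can be put back together with cat, because the suffixes sort in the order the pieces were written. This sketch uses a tiny made-up file and 1000-byte pieces so it runs instantly; the part_ prefix is an arbitrary choice for the demo.

```shell
#!/bin/sh
# Split a file into fixed-size pieces, then reassemble and verify it.
workdir=$(mktemp -d)

# 210 lines of 12 bytes each = 2520 bytes of sample data.
i=0
while [ "$i" -lt 210 ]; do
    printf 'record %04d\n' "$i"
    i=$((i + 1))
done > "$workdir/big.dat"

# 1000-byte pieces, one-letter suffixes, custom prefix: part_a, part_b, part_c
split -b 1000 -a 1 "$workdir/big.dat" "$workdir/part_"
pieces=$(ls "$workdir" | grep -c '^part_')
echo "number of pieces: $pieces"

# The shell expands part_* in sorted order, which matches the original order.
cat "$workdir"/part_* > "$workdir/rejoined.dat"

if cmp -s "$workdir/big.dat" "$workdir/rejoined.dat"; then
    rejoined="identical"
else
    rejoined="different"
fi
echo "rejoined file is $rejoined"

rm -rf "$workdir"
```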

Message Digests
  Say you want to give a file to a friend; when they download it, how do they know for sure that the file data is the original file data?
  It could have been altered or corrupted (either unintentionally or intentionally).
  One way to handle this:
      Generate a message digest for the file.
      Your friend downloads the file and the digest.
      They generate a digest from the file they received.
      If their digest matches your digest, life is good.
  Message digest = kind of like a fingerprint for the data.
  Believed to be computationally infeasible to have two different files generate the same digest.

Generating Message Digests
  Depending on what kind of digest you want to generate, the name of the command is different.
  Under FreeBSD, the command is usually just the name:
      md5       generates an MD5 digest
      sha1      generates a SHA-1 digest
  Under Linux, it is the name + sum:
      md5sum
      sha1sum
      sha256sum
  MD5 is thoroughly broken (collisions are easy to generate), and collisions have also been demonstrated for SHA-1, so it's recommended you use SHA-256 (or higher) if security is a concern.
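
The send-and-verify procedure can be simulated locally. This is a sketch: the files are made up, digest_of is a helper defined only for this demo, and the fallback branch assumes the FreeBSD sha256 command with its -q (digest only) option when sha256sum is not present.

```shell
#!/bin/sh
# Compare digests of an original file, a clean copy, and a tampered copy.
workdir=$(mktemp -d)

# digest_of is a hypothetical helper: print only the SHA-256 hex digest of a file.
if command -v sha256sum >/dev/null 2>&1; then
    digest_of() { sha256sum "$1" | cut -d ' ' -f 1; }    # Linux naming
else
    digest_of() { sha256 -q "$1"; }                      # FreeBSD naming
fi

# "You" publish the file together with its digest.
printf 'important data\n' > "$workdir/original.dat"
published=$(digest_of "$workdir/original.dat")

# "Your friend" downloads a clean copy and recomputes the digest.
cp "$workdir/original.dat" "$workdir/downloaded.dat"
received=$(digest_of "$workdir/downloaded.dat")
if [ "$published" = "$received" ]; then clean="match"; else clean="MISMATCH"; fi

# A single extra byte changes the digest.
printf 'x' >> "$workdir/downloaded.dat"
altered=$(digest_of "$workdir/downloaded.dat")
if [ "$published" = "$altered" ]; then tampered="match"; else tampered="MISMATCH"; fi

echo "published digest: $published"
echo "clean copy:       $clean"
echo "tampered copy:    $tampered"

rm -rf "$workdir"
```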