Mineração de Dados Aplicada


1 Simple but Powerful Text-Processing Commands
August 29th, 2018
DCC, ICEx, UFMG

2 Unix philosophy
Doug McIlroy (inventor of Unix pipes), in A Quarter-Century of Unix (1994):
    Write programs that do one thing and do it well.
    Write programs to work together.
    Write programs to handle text streams, because that is a universal interface.

4 Utilities from the 70s
The Unix operating system came with several text-processing commands that are still very useful today. Because each one is specialized, these commands are very efficient. The GNU project, started in 1984, has improved them a great deal (e.g., by adding options). The original commands are part of some POSIX standards.
Most Free operating systems are POSIX-compliant: GNU/Linux, BSD, illumos, Haiku, etc. Mac OS X is too. Windows is not, but Cygwin is a good compatibility layer.

5 The shell
The shell interprets every command typed in a terminal or read from an executable file, whose first line indicates the shell to use: #!/bin/sh (any POSIX-compliant shell), #!/bin/bash, #!/bin/dash, #!/bin/zsh, etc.
A 2-line shell script:
    #!/bin/sh
    # Get started with the MDA exercises
    wget dcc.ufmg.br/~lcerf/data.tar.xz -O - | tar -xJ
    mv data data-$(date +%m%d) 2> /dev/null

6 Geeks and repetitive tasks

8 Standard I/O
POSIX text-processing commands:
    process the input line by line;
    read, by default, the standard input (the keyboard if not redirected);
    write, by default, on the standard output (the terminal if not redirected).
< and > redirect the standard input and the standard output from and to a file.
A pipe binds an output stream to an input stream. It can bear a name (given as an argument to mkfifo), but most workflows only need the unnamed pipe, |, which redirects the standard output of the command on its left to the standard input of the command on its right.
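The redirections and the unnamed pipe can be tried on a throwaway file. The following is only a minimal sketch: the temporary directory and the file names fruits.txt and upper.txt are made up for the illustration and are not part of the course data.

```shell
#!/bin/sh
# Work in a temporary directory so that no existing file is overwritten.
tmp=$(mktemp -d)

# > redirects the standard output of printf to a file.
printf 'apple\nbanana\ncherry\n' > "$tmp/fruits.txt"

# < feeds the file to the standard input of tr; > stores what tr outputs.
tr a-z A-Z < "$tmp/fruits.txt" > "$tmp/upper.txt"

# | binds the standard output of head to the standard input of wc.
count=$(head -n 2 "$tmp/upper.txt" | wc -l)

first=$(head -n 1 "$tmp/upper.txt")
echo "first line: $first, lines counted: $count"

rm -r "$tmp"
```

None of the three commands needs to know whether it reads from a file, from a pipe or from the keyboard: the shell sets up the streams before the command starts.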


14 Getting the data
    $ wget dcc.ufmg.br/~lcerf/data.tar.xz -O - | tar -xJ
wget and tar are two GNU commands. Like all GNU commands:
    the man command (e.g., man wget) gives their specifications;
    the info command (e.g., info wget) often provides more detailed explanations, examples of use, etc.;
    long options are prefixed with --, short (i.e., one-letter) options with -, and short options can be grouped (e.g., -xJ);
    an option can take an argument, given right after it;
    as a file argument, - means the standard input (/dev/stdin) or the standard output (/dev/stdout);
    the unnamed pipe, |, redirects the standard output of the command on its left to the standard input of the command on its right.

17 Reading a large text file
Your favorite text editor (Vim or Emacs?) loads the entire file in main memory, which is a problem if the file weighs gigabytes or more.
The solution is named less. It is the viewer for man pages. A few commands inside less:
    Page-up/down;
    R (repaint);
    F (follow);
    [0-9]+ followed by Enter (scroll forward that many lines);
    / (search forwards for a regexp);
    ? (search backwards for a regexp);
    q (quit).
Exercise: Find, with less, the IP address of the first Brazilian visitor after the 100th line of DistroWatch/ /debian.

19 Printing the first/last lines
Do not test your scripts on the whole dataset!
head outputs the head of a file; tail its tail. A few options:
    -[0-9]+ sets the number of lines (10 by default);
    -n NUM does too, but a sign prefix changes its meaning: head -n -NUM (GNU head) outputs all lines except the last NUM, and tail -n +NUM outputs the lines starting at line NUM (i.e., all but the first NUM-1);
    -f follows the data appended to the file (tail only).
Exercise: Print lines 5 to 15 of one file in DistroWatch.com's logs.
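The effect of the sign prefixes is easier to see on a file whose lines are their own line numbers. This is a sketch under assumptions: numbers.txt is a made-up file, seq is assumed available (it ships with GNU coreutils), and head -n -NUM is a GNU extension.

```shell
#!/bin/sh
tmp=$(mktemp -d)
seq 1 20 > "$tmp/numbers.txt"    # 20 lines: 1, 2, ..., 20

first3=$(head -n 3 "$tmp/numbers.txt")     # the first 3 lines
last2=$(tail -n 2 "$tmp/numbers.txt")      # the last 2 lines
from18=$(tail -n +18 "$tmp/numbers.txt")   # from line 18 to the end
but17=$(head -n -17 "$tmp/numbers.txt")    # all but the last 17 lines (GNU head)

# Combining both commands selects a range: here, lines 7 to 8.
middle=$(head -n 8 "$tmp/numbers.txt" | tail -n +7)

printf '%s\n' "$middle"
rm -r "$tmp"
```

Note that tail -n +18 starts at line 18; it does not skip 18 lines.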

21 Shuffling lines
The shuf command shuffles the lines of a file. A few options:
    -n [0-9]+ outputs at most that many lines (uniform random sampling without replacement);
    -r allows repetitions (uniform random sampling with replacement);
    -e shuffles the arguments, each treated as a line, rather than the lines of a file.
Exercise: Sample, uniformly at random, 100 visits to the Ubuntu page on April 28th.
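The three options can be compared on a small made-up population. A minimal sketch, assuming GNU shuf and seq; population.txt is a hypothetical file, and the output differs at every run since the sampling is random.

```shell
#!/bin/sh
tmp=$(mktemp -d)
seq 1 1000 > "$tmp/population.txt"

# 5 distinct lines, sampled uniformly without replacement.
sample=$(shuf -n 5 "$tmp/population.txt")

# 5 draws with replacement: the same line may come out several times.
draws=$(shuf -r -n 5 "$tmp/population.txt")

# -e shuffles the arguments themselves instead of the lines of a file.
coin=$(shuf -e -n 1 heads tails)

printf 'sample:\n%s\ncoin: %s\n' "$sample" "$coin"
rm -r "$tmp"
```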

23 Concatenating files
The cat command concatenates files and prints the result on the standard output.
Exercise: Concatenate the files related to the visits to the Ubuntu page.

24 Replacing characters
Given two lists of characters as arguments, the tr command outputs its standard input after replacing (aka translating) every character in the first list with the one at the same position in the second list. If the second list is shorter, its last character is considered repeated up to the size of the first list. A - between two characters defines an interval (e.g., a-z). A few options:
    -c takes, as the first list, the complement of the provided characters;
    -d deletes (rather than replaces) the characters in the first list;
    -s squeezes: every run of a repeated character of the last list given is reduced to a single occurrence.
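Each rule can be checked on a one-line input. The strings below are made up for the illustration; the padding of a shorter second list is the GNU tr behaviour described on the slide.

```shell
#!/bin/sh
# Two characters mapped to one: the second list is padded with its last character.
spaced=$(printf 'a,b;c\n' | tr ',;' ' ')

# Intervals: every lowercase letter becomes the uppercase letter at the same position.
upper=$(printf 'hello\n' | tr a-z A-Z)

# -d deletes the characters of the list.
nodigits=$(printf 'r2d2\n' | tr -d 0-9)

# -s squeezes every run of spaces into a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# -c complements the list: delete everything but lowercase letters and newlines.
letters=$(printf 'a1,b2\n' | tr -cd 'a-z\n')

printf '%s\n' "$spaced" "$upper" "$nodigits" "$squeezed" "$letters"
```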

25 Basic reformatting with tr
Exercise: Choose one of the files in DistroWatch.com's logs and:
    1 change its delimiters into spaces;
    2 make its country codes lowercase.
Notice that the shell processes the command line before executing it: the command line is split at whitespace, $var is replaced by the content of the shell variable var, * is replaced by any string so as to match existing file paths, etc. To preserve the literal meaning:
    a backslash can precede every special character;
    a (portion of an) argument can be enclosed in single quotes (or in double quotes to still let the shell interpret $ and `).
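The three ways of protecting an argument can be compared side by side. In this sketch, the variable var and the strings are arbitrary examples.

```shell
#!/bin/sh
var=world

single='hello $var'     # single quotes: nothing is interpreted
double="hello $var"     # double quotes: $var is still replaced by its content
escaped=hello\ \$var    # backslashes protect the space and the $

# Unquoted whitespace separates arguments; quoted whitespace does not.
set -- one two 'three four'
nargs=$#

printf '%s\n' "$single" "$double" "$escaped" "$nargs"
```

The last line counts three arguments, not four, because the quoted space is part of the third argument.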

27 Counting lines, words and/or characters
The wc command counts the number of lines (option -l), of words (option -w) or of characters (option -m).
Exercise: In DistroWatch.com's logs, what are the numbers of visits per day to the Ubuntu page? Is there something special (compared to other distributions)?
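A two-line text, made up for the illustration, is enough to see the three counters at work. Newline characters are counted by -m.

```shell
#!/bin/sh
text='one two three
four five'

lines=$(printf '%s\n' "$text" | wc -l)   # number of newline characters
words=$(printf '%s\n' "$text" | wc -w)   # whitespace-separated words
chars=$(printf '%s\n' "$text" | wc -m)   # characters, newlines included

echo "$lines lines, $words words, $chars characters"
```

When wc reads its standard input, it prints the count alone; given file names as arguments, it also prints each name (and a total if there are several files).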

29 Selecting fields
The cut command selects fields. cut considers that there is an empty field between two consecutive delimiters.
    -d specifies the delimiter (a tabulation by default);
    -f specifies the fields cut must keep (comma-separated numbers or intervals, using -).
Exercise: From one of the files in DistroWatch.com's logs, get rid of the IP addresses.
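The empty field between two consecutive delimiters shows up on a made-up comma-separated line:

```shell
#!/bin/sh
line='a,b,,d'    # four fields, the third one being empty

second=$(printf '%s\n' "$line" | cut -d , -f 2)
third=$(printf '%s\n' "$line" | cut -d , -f 3)

# A list of fields: the first one and the interval from 3 to 4.
kept=$(printf '%s\n' "$line" | cut -d , -f 1,3-4)

printf 'second=%s third=%s kept=%s\n' "$second" "$third" "$kept"
```

The selected fields are output in their original order, separated by the input delimiter.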

30 Pasting
The paste command joins, side by side, the corresponding lines of the input files, in the order the files are given.
    -d specifies the delimiter (a tabulation by default).
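As an illustration, two made-up files with three lines each (codes.txt and names.txt, both hypothetical) can be pasted line by line:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'BR\nFR\nUS\n' > "$tmp/codes.txt"
printf 'Brazil\nFrance\nUnited States\n' > "$tmp/names.txt"

# Default delimiter: a tabulation.
tabbed=$(paste "$tmp/codes.txt" "$tmp/names.txt")

# -d sets another delimiter.
comma=$(paste -d , "$tmp/codes.txt" "$tmp/names.txt")

printf '%s\n' "$comma"
rm -r "$tmp"
```

paste is the inverse of cut in the sense that it adds columns where cut removes some.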

31 Comparing
comm compares the lines of two sorted files. It outputs the lines only in the first file (first column), those only in the second file (second column), and those in both files (third column).
    -1, -2 and -3 respectively remove the first, second and third column from the output.
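Removing two of the three columns turns comm into a set difference or a set intersection. A minimal sketch on two made-up sorted files, a.txt and b.txt (hypothetical names); LC_ALL=C is set so that the ordering comm expects is the plain byte order used to write the files.

```shell
#!/bin/sh
export LC_ALL=C
tmp=$(mktemp -d)
printf 'AR\nBR\nFR\n' > "$tmp/a.txt"    # already sorted
printf 'BR\nDE\nFR\n' > "$tmp/b.txt"    # already sorted

only_a=$(comm -23 "$tmp/a.txt" "$tmp/b.txt")   # lines only in the first file
only_b=$(comm -13 "$tmp/a.txt" "$tmp/b.txt")   # lines only in the second file
both=$(comm -12 "$tmp/a.txt" "$tmp/b.txt")     # lines in both files

printf 'only a: %s\nonly b: %s\nboth:\n%s\n' "$only_a" "$only_b" "$both"
rm -r "$tmp"
```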

32 Joining
join joins the lines of two files that share the same join field. Both files must be sorted on their join fields. A few options:
    -1 NUM sets the join field of the first file, -2 NUM that of the second file, -j NUM that of both;
    -a 1 or -a 2 prints the unpairable lines from the specified file too;
    -v 1 or -v 2 only prints the unpairable lines from the specified file;
    -i ignores case;
    -t CHAR sets the delimiter.
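The difference between the default output, -a and -v is visible on two small files joined on their first field. The files visits.txt and names.txt and their content are invented for this sketch.

```shell
#!/bin/sh
export LC_ALL=C
tmp=$(mktemp -d)
# Both files are sorted on their first field, the join field by default.
printf 'BR 3\nFR 5\nUS 2\n' > "$tmp/visits.txt"
printf 'BR Brazil\nDE Germany\nFR France\n' > "$tmp/names.txt"

# Only the lines whose join field appears in both files.
joined=$(join "$tmp/visits.txt" "$tmp/names.txt")

# The same, plus the unpairable lines of the first file.
with_left=$(join -a 1 "$tmp/visits.txt" "$tmp/names.txt")

# Only the unpairable lines of the first file.
unpaired=$(join -v 1 "$tmp/visits.txt" "$tmp/names.txt")

printf '%s\n' "$with_left"
rm -r "$tmp"
```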

34 Sorting
The sort command sorts a text file. The locale settings affect the ordering (you may want LC_ALL=C). A few options:
    -r reverses the ordering;
    -f ignores case;
    -n sorts numerically;
    -c checks whether the input is sorted;
    -k POS1[,POS2] sorts according to the fields in the interval;
    -t sets the field delimiter;
    -u removes duplicates;
    -m merges already sorted files.
Exercise: In DistroWatch.com's logs:
    1 How many different countries are there?
    2 On the 29th, what are the countries that originated visits to the site but no visit to Ubuntu's page?
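A few of these options can be combined on a made-up list of name:count lines. The data below is purely illustrative; LC_ALL=C is set for each call so that the result does not depend on the locale.

```shell
#!/bin/sh
data='pear:12
apple:7
fig:100
apple:7'

# Plain sort: byte order on the whole line.
alpha=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -u keeps one copy of each line.
distinct=$(printf '%s\n' "$data" | LC_ALL=C sort -u)

# Numeric sort on the second field only, fields being separated by colons.
by_count=$(printf '%s\n' "$data" | LC_ALL=C sort -t : -k 2,2n)

printf '%s\n' "$by_count"
```

Without the n modifier, the second field would be compared as text and 100 would come before 12.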

36 Reporting or omitting repeated lines
The uniq command removes adjacent repeated lines; the input is therefore usually sorted first. A few options:
    -c reports the counts of repetitions;
    -d only prints repeated lines;
    -u only prints unique lines;
    -i ignores case.
Exercise: In DistroWatch.com's logs, what are the ten countries that originated the highest numbers of accesses to the index page during the three days?
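Because uniq only compares adjacent lines, it is almost always preceded by sort. A sketch on six made-up lines:

```shell
#!/bin/sh
data='b
b
a
b
c
a'

# Unsorted input: only the two adjacent b lines are merged.
adjacent=$(printf '%s\n' "$data" | uniq)

# Sorted input: every value is counted, then the counts are sorted in decreasing order.
counts=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -c | sort -n -r)

repeated=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -d)   # values seen more than once
once=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -u)       # values seen exactly once

printf '%s\n' "$counts"
```

uniq -c left-pads the counts with spaces, hence the numeric sort with -n.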

37 Homework
Exercise for next Wednesday: Given a list of keywords and a text, use the commands presented so far to:
    1 list each keyword occurring in the text;
    2 count the occurrences of each keyword in the text.
Reading for next Wednesday: Basic Regular Expressions in Wikipedia.

38 License
These slides are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.


Using UNIX. -rwxr--r-- 1 root sys Sep 5 14:15 good_program Using UNIX. UNIX is mainly a command line interface. This means that you write the commands you want executed. In the beginning that will seem inferior to windows point-and-click, but in the long run the

More information

EECS 470 Lab 5. Linux Shell Scripting. Friday, 1 st February, 2018

EECS 470 Lab 5. Linux Shell Scripting. Friday, 1 st February, 2018 EECS 470 Lab 5 Linux Shell Scripting Department of Electrical Engineering and Computer Science College of Engineering University of Michigan Friday, 1 st February, 2018 (University of Michigan) Lab 5:

More information

UNIX II:grep, awk, sed. October 30, 2017

UNIX II:grep, awk, sed. October 30, 2017 UNIX II:grep, awk, sed October 30, 2017 File searching and manipulation In many cases, you might have a file in which you need to find specific entries (want to find each case of NaN in your datafile for

More information

Lab 2: Linux/Unix shell

Lab 2: Linux/Unix shell Lab 2: Linux/Unix shell Comp Sci 1585 Data Structures Lab: Tools for Computer Scientists Outline 1 2 3 4 5 6 7 What is a shell? What is a shell? login is a program that logs users in to a computer. When

More information

CSE 303 Lecture 2. Introduction to bash shell. read Linux Pocket Guide pp , 58-59, 60, 65-70, 71-72, 77-80

CSE 303 Lecture 2. Introduction to bash shell. read Linux Pocket Guide pp , 58-59, 60, 65-70, 71-72, 77-80 CSE 303 Lecture 2 Introduction to bash shell read Linux Pocket Guide pp. 37-46, 58-59, 60, 65-70, 71-72, 77-80 slides created by Marty Stepp http://www.cs.washington.edu/303/ 1 Unix file system structure

More information

Part III. Shell Config. Tobias Neckel: Scripting with Bash and Python Compact Max-Planck, February 16-26,

Part III. Shell Config. Tobias Neckel: Scripting with Bash and Python Compact Max-Planck, February 16-26, Part III Shell Config Compact Course @ Max-Planck, February 16-26, 2015 33 Special Directories. current directory.. parent directory ~ own home directory ~user home directory of user ~- previous directory

More information

Shell Programming Overview

Shell Programming Overview Overview Shell programming is a way of taking several command line instructions that you would use in a Unix command prompt and incorporating them into one program. There are many versions of Unix. Some

More information

Shells and Shell Programming

Shells and Shell Programming Shells and Shell Programming 1 Shells A shell is a command line interpreter that is the interface between the user and the OS. The shell: analyzes each command determines what actions are to be performed

More information

CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater)

CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater) CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater) 28 January 2013 Jason Yosinski Original slides available under Creative Commons Attribution-ShareAlike 3.0

More information

CSC 2500: Unix Lab Fall 2016

CSC 2500: Unix Lab Fall 2016 CSC 2500: Unix Lab Fall 2016 IO Redirection Mohammad Ashiqur Rahman Department of Computer Science College of Engineering Tennessee Tech University Agenda Standard IO IO Redirection Pipe Various File Processing

More information

Unix as a Platform Exercises + Solutions. Course Code: OS 01 UNXPLAT

Unix as a Platform Exercises + Solutions. Course Code: OS 01 UNXPLAT Unix as a Platform Exercises + Solutions Course Code: OS 01 UNXPLAT Working with Unix Most if not all of these will require some investigation in the man pages. That's the idea, to get them used to looking

More information

Bashed One Too Many Times. Features of the Bash Shell St. Louis Unix Users Group Jeff Muse, Jan 14, 2009

Bashed One Too Many Times. Features of the Bash Shell St. Louis Unix Users Group Jeff Muse, Jan 14, 2009 Bashed One Too Many Times Features of the Bash Shell St. Louis Unix Users Group Jeff Muse, Jan 14, 2009 What is a Shell? The shell interprets commands and executes them It provides you with an environment

More information

Review of Fundamentals. Todd Kelley CST8207 Todd Kelley 1

Review of Fundamentals. Todd Kelley CST8207 Todd Kelley 1 Review of Fundamentals Todd Kelley kelleyt@algonquincollege.com CST8207 Todd Kelley 1 GPL the shell SSH (secure shell) the Course Linux Server RTFM vi general shell review 2 These notes are available on

More information

Shells and Shell Programming

Shells and Shell Programming Shells and Shell Programming Shells A shell is a command line interpreter that is the interface between the user and the OS. The shell: analyzes each command determines what actions are to be performed

More information

Introduction. File System. Note. Achtung!

Introduction. File System. Note. Achtung! 3 Unix Shell 1: Introduction Lab Objective: Explore the basics of the Unix Shell. Understand how to navigate and manipulate file directories. Introduce the Vim text editor for easy writing and editing

More information

CS Unix Tools. Fall 2010 Lecture 5. Hussam Abu-Libdeh based on slides by David Slater. September 17, 2010

CS Unix Tools. Fall 2010 Lecture 5. Hussam Abu-Libdeh based on slides by David Slater. September 17, 2010 Fall 2010 Lecture 5 Hussam Abu-Libdeh based on slides by David Slater September 17, 2010 Reasons to use Unix Reason #42 to use Unix: Wizardry Mastery of Unix makes you a wizard need proof? here is the

More information

CSE 391 Lecture 3. bash shell continued: processes; multi-user systems; remote login; editors

CSE 391 Lecture 3. bash shell continued: processes; multi-user systems; remote login; editors CSE 391 Lecture 3 bash shell continued: processes; multi-user systems; remote login; editors slides created by Marty Stepp, modified by Jessica Miller and Ruth Anderson http://www.cs.washington.edu/391/

More information

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version...

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version... Contents Note: pay attention to where you are........................................... 1 Note: Plaintext version................................................... 1 Hello World of the Bash shell 2 Accessing

More information

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK Cloud Computing and Unix: An Introduction Dr. Sophie Shaw University of Aberdeen, UK s.shaw@abdn.ac.uk Aberdeen London Exeter What We re Going To Do Why Unix? Cloud Computing Connecting to AWS Introduction

More information

Reading and manipulating files

Reading and manipulating files Reading and manipulating files Goals By the end of this lesson you will be able to Read files without using text editors Access specific parts of files Count the number of words and lines in a file Sort

More information

do shell script in AppleScript

do shell script in AppleScript Technical Note TN2065 do shell script in AppleScript This Technote answers frequently asked questions about AppleScript s do shell script command, which was introduced in AppleScript 1.8. This technical

More information

bash, part 3 Chris GauthierDickey

bash, part 3 Chris GauthierDickey bash, part 3 Chris GauthierDickey More redirection As you know, by default we have 3 standard streams: input, output, error How do we redirect more than one stream? This requires an introduction to file

More information

Dalhousie University CSCI 2132 Software Development Winter 2018 Lab 2, January 25

Dalhousie University CSCI 2132 Software Development Winter 2018 Lab 2, January 25 Dalhousie University CSCI 2132 Software Development Winter 2018 Lab 2, January 25 In this lab, you will first learn autocompletion, a feature of the Bash shell. You will also learn more about the command

More information

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University Unix/Linux Basics 1 Some basics to remember Everything is case sensitive Eg., you can have two different files of the same name but different case in the same folder Console-driven (same as terminal )

More information

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK Cloud Computing and Unix: An Introduction Dr. Sophie Shaw University of Aberdeen, UK s.shaw@abdn.ac.uk Aberdeen London Exeter What We re Going To Do Why Unix? Cloud Computing Connecting to AWS Introduction

More information

Bourne Shell Reference

Bourne Shell Reference > Linux Reviews > Beginners: Learn Linux > Bourne Shell Reference Bourne Shell Reference found at Br. David Carlson, O.S.B. pages, cis.stvincent.edu/carlsond/cs330/unix/bshellref - Converted to txt2tags

More information

CSE 391 Lecture 1. introduction to Linux/Unix environment

CSE 391 Lecture 1. introduction to Linux/Unix environment CSE 391 Lecture 1 introduction to Linux/Unix environment slides created by Marty Stepp, modified by Jessica Miller & Ruth Anderson http://www.cs.washington.edu/391/ 1 2 Lecture summary Course introduction

More information

Computer Systems and Architecture

Computer Systems and Architecture Computer Systems and Architecture Stephen Pauwels Computer Systems Academic Year 2018-2019 Overview of the Semester UNIX Introductie Regular Expressions Scripting Data Representation Integers, Fixed point,

More information

Introduction to Unix Week 3

Introduction to Unix Week 3 Week 3 cat [file ] Display contents of files cat sample.file This is a sample file that i'll use to demo how the pr command is used. The pr command is useful in formatting various types of text files.

More information

Lab #8: Introduction to UNIX and GMT

Lab #8: Introduction to UNIX and GMT Geol 335.3 1 Lab #8: Introduction to UNIX and GMT In this lab, you ll familiarize yourself with some of the leading components of scientific computing: UNIX operating system, and a free, open-source, GIS/plotting

More information

Introduction to Bash Programming. Dr. Xiaolan Zhang Spring 2013 Dept. of Computer & Information Sciences Fordham University

Introduction to Bash Programming. Dr. Xiaolan Zhang Spring 2013 Dept. of Computer & Information Sciences Fordham University Introduction to Bash Programming Dr. Xiaolan Zhang Spring 2013 Dept. of Computer & Information Sciences Fordham University 1 Outline Shell command line syntax Shell builtin commands Shell variables, arguments

More information

CS 25200: Systems Programming. Lecture 11: *nix Commands and Shell Internals

CS 25200: Systems Programming. Lecture 11: *nix Commands and Shell Internals CS 25200: Systems Programming Lecture 11: *nix Commands and Shell Internals Dr. Jef Turkstra 2018 Dr. Jeffrey A. Turkstra 1 Lecture 11 Shell commands Basic shell internals 2018 Dr. Jeffrey A. Turkstra

More information

CSE 391 Lecture 1. introduction to Linux/Unix environment

CSE 391 Lecture 1. introduction to Linux/Unix environment CSE 391 Lecture 1 introduction to Linux/Unix environment slides created by Marty Stepp, modified by Jessica Miller & Ruth Anderson http://www.cs.washington.edu/391/ 1 2 Lecture summary Course introduction

More information