Command Line and Python Introduction
Jennifer Helsby, Eric Potash
Computation for Public Policy
Lecture 2: January 7, 2016

Today
Assignment #1!
Computer architecture
Basic command line skills
Python fundamentals
Variables and types
Basic data structures
First programs
Basic plotting with Python's matplotlib

Assignment #1
Goal: Get familiar with Python. Learn to do simple Python programming and make simple plots.
Use crime statistics data from the City of Chicago data portal
Due in 1 week: January 14, 2016
Submit via email to the course email account

Class Virtual Machine
Class VM download: Link on Chalk
Load with VirtualBox (available for Mac, Windows, GNU/Linux)
Feel free to install software yourself

Hardware
RAM, HDD, CPU, I/O

Software
Layers, from top to bottom: users (user 1, user 2, user 3, ..., user n); system and application programs; operating system; hardware
Adapted from Silberschatz, Galvin and Gagne

Operating System
Does things that the user doesn't need/want to deal with
Makes the system more efficient and convenient to use
Intermediary between the hardware and user

Unix and its variants
Unix is a family of extremely popular operating systems
Variants: GNU/Linux, Mac OS X, BSD Unix

http://www.computerworld.com/article/2524660/operating-systems/the-unix-family-tree.html

Command Line: Mac OS X

Command Line: Debian GNU/Linux

Command Line Syntax

Command Line: Basic Unix Commands 1
ls: List contents of directory
pwd: Print current working directory
cd: Change directory
mkdir: Make new directory
cp: Copy a file or directory (with -r)
cat: Print to screen (really concatenate)

Command Line: Basic Unix Commands 2
rm: Remove file
rmdir: Remove directory
Text editor in terminal (for example nano)
Get file from the web (for example wget)
Use rm with caution!!

Navigating Files
Look at first few or last few lines using head and tail (especially important for very large files)
Search through files looking for certain strings using grep

Notation
Default directory is a user's home directory (~)
Current directory: .
Parent directory: ..
What directory is?

Redirecting Output
cmd1 | cmd2: Send output of cmd1 as input to cmd2
cmd > file: Send output of cmd to a new file
cmd >> file: Append output of cmd to an existing file

Example: Redirection
Download the current employee names, salaries, and position titles of city employees from the City of Chicago data portal
Let's count the number of police officers in the dataset using grep and redirection
wc -l counts the number of lines
Can also redirect to a pager and page through the results

Example: Redirection Take only the detectives and redirect the output to a new text file:
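The same filter-and-count idea can be sketched in Python, the language used in the rest of the lecture. This is only an illustration: the rows below are made-up stand-ins for the salary file, and the variable names are assumptions.

```python
# Hypothetical rows standing in for the downloaded salary file (not real data).
lines = [
    "DOE, JANE,POLICE OFFICER,POLICE,$80000.00",
    "ROE, RICHARD,LIBRARIAN I,PUBLIC LIBRARY,$60000.00",
    "POE, ALEX,POLICE OFFICER (ASSIGNED AS DETECTIVE),POLICE,$95000.00",
]

# Like grep: keep only the lines that contain the search string.
police = [line for line in lines if "POLICE OFFICER" in line]

# Like wc -l: count the matching lines.
n_police = len(police)

# Like a second grep: narrow the matches down to detectives.
detectives = [line for line in police if "DETECTIVE" in line]

# Like redirecting with >: gather the result as text that could be written to a file.
detective_text = "\n".join(detectives)
```

Each list comprehension plays the role of one stage in a pipe: the output of one step becomes the input of the next.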

Python
Popular and easy to learn interpreted language
Two versions of Python commonly used: 2.7.x and 3.x (recommended)
Running Python code:
From command line
From interpreter

Python: Interpreter
Quit Python with exit() or CTRL-D

Python: Command Line

Python: Basic Data Types
Integer
String
List
Floating point
Boolean: True, False
Dictionary

Python: Variables Assign values to variables using the assignment operator (=):
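A minimal sketch of assignment, with one variable for each of the basic data types. The variable names and values are invented for illustration and are not taken from the slides.

```python
# Illustrative values only; the names are made up for this sketch.
n_students = 30                           # integer
course = "Computation for Public Policy"  # string
primes = [2, 3, 5, 7, 11]                 # list
pi_approx = 3.14159                       # floating point
is_due = True                             # Boolean (True or False)
ages = {"Rover": 4, "Fido": 7}            # dictionary (key: value pairs)

# A variable can be reassigned; the name now refers to the new value.
n_students = n_students + 1
```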

Python: Arithmetic
Simple integer arithmetic
Arithmetic operators: +, -, *, /, // (discard remainder), ** (raise to the power), % (mod, return remainder of division)
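A short sketch of the arithmetic operators, assuming Python 3 (where / is true division and // discards the remainder). The operands are arbitrary.

```python
a = 7
b = 2

total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # true division in Python 3, gives a float
floored = a // b     # floor division, remainder discarded
power = a ** b       # a raised to the power b
remainder = a % b    # mod, remainder of the division
```

In Python 2.7, dividing two integers with / also discards the remainder, which is one of the visible differences between the two versions.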

Python: String Manipulation
Assign a string using quotes
Indexing a string using square brackets:
index     0 1 2 3 4
dog_name  R o v e r
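As a sketch of the indexing table, using the same dog_name string (the other variable names are illustrative):

```python
dog_name = "Rover"

first_letter = dog_name[0]   # index 0 is the first character
last_letter = dog_name[4]    # index 4 is the fifth (last) character
also_last = dog_name[-1]     # negative indices count from the end
first_three = dog_name[0:3]  # a slice: indices 0, 1 and 2
```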

Python: String Manipulation
Concatenate (join strings together) with + and repeat them N times with *
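A small sketch of both operations; the strings are made up for illustration.

```python
dog_name = "Rover"

# + joins two strings end to end.
greeting = "Good dog, " + dog_name

# * repeats a string the given number of times.
barks = "Woof! " * 3
```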

Python: String Methods 1
Split a string on spaces using split()
Split on another character by passing it to split()

Python: String Methods 2
Return the uppercase and lowercase versions of a string with upper() and lower()
Replace a substring with another using replace()
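The string methods from the last two slides can be sketched as follows. The example strings are invented; only the method names come from the standard str type.

```python
title = "Computation for Public Policy"
clock = "12:30:45"

words = title.split()      # split on whitespace
parts = clock.split(":")   # split on another character

shouted = title.upper()    # uppercase copy
whispered = title.lower()  # lowercase copy

# replace() returns a new string; the original is unchanged.
renamed = title.replace("Public", "Social")
```

Strings are immutable, so each method returns a new string rather than modifying title in place.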

Python: Lists
Assign lists with comma-separated values in square brackets
Index with square brackets:
index             0 1 2 3 4
favorite_numbers  2 3 5 7 11

Python: Length of list or string
Determine length of list with len()
Similarly with strings
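A brief sketch of list assignment, indexing, and length, using the favorite_numbers list from the slide (the other names are illustrative):

```python
favorite_numbers = [2, 3, 5, 7, 11]

first = favorite_numbers[0]          # index 0 is the first element
fifth = favorite_numbers[4]          # index 4 is the fifth element

n_numbers = len(favorite_numbers)    # number of elements in the list
n_letters = len("Rover")             # len() also works on strings
```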

Python: List Methods 1
Append a new value with append() or multiple values with extend():
index             0 1 2 3 4 5
favorite_numbers  2 3 5 7 11 13
And then pop them off with pop()

Python: List Methods 2
Reverse list with reverse()
Sort lists with sort()
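A sketch of the list methods in sequence, starting from the favorite_numbers list. The extra values 17 and 19 are made up for illustration.

```python
favorite_numbers = [2, 3, 5, 7, 11]

favorite_numbers.append(13)        # add one value to the end
favorite_numbers.extend([17, 19])  # add several values to the end

popped = favorite_numbers.pop()    # remove and return the last value

favorite_numbers.reverse()         # reverse in place
reversed_copy = list(favorite_numbers)

favorite_numbers.sort()            # sort in place, ascending
```

These methods change the list itself and (apart from pop) return None, so the result should not be assigned back to the variable.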

Python: Dictionaries
Dictionaries are key-value stores
Define using curly braces
Reference a given entry using square brackets
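A minimal dictionary sketch; the keys and values are invented for illustration.

```python
# Keys map to values; here dog names map to ages.
dog_ages = {"Rover": 4, "Fido": 7}

rover_age = dog_ages["Rover"]   # look up a value by its key
dog_ages["Rex"] = 2             # add a new key-value pair
dog_ages["Fido"] = 8            # overwrite an existing value

has_rex = "Rex" in dog_ages     # test whether a key is present
n_dogs = len(dog_ages)          # number of key-value pairs
```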

Python: Boolean
Comparison operators: == (equal to), != (not equal to), <, >, <= (less than or equal to), >=
Combine with and, or
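A short sketch of comparisons and how they combine; the value of x is arbitrary.

```python
x = 5

is_five = x == 5         # equal to
not_three = x != 3       # not equal to
at_most_five = x <= 5    # less than or equal to
above_ten = x > 10       # greater than

# and is True only if both sides are True; or needs at least one True side.
in_range = (x > 0) and (x < 10)
either = (x < 0) or (x == 5)
negated = not is_five
```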

Python: Data Types
Determine data types using type()
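As a sketch, type() can be applied to one value of each basic type. The values are illustrative.

```python
values = [42, "Rover", [2, 3, 5], 3.14, True, {"Rover": 4}]

# type(v) returns the type object; __name__ gives its short name as a string.
type_names = [type(v).__name__ for v in values]

is_integer = type(42) is int
```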

Python: Loading Python modules
Wide array of modules in Python available for almost any computational task
Use existing code in these libraries when you can to avoid reinventing the wheel
Libraries are imported into the Python environment using the import statement
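The common forms of the import statement can be sketched with standard-library modules. The choice of math, statistics, and datetime here is an assumption made for illustration; the slides do not say which modules were shown.

```python
import math                  # import a whole module
from statistics import mean  # import one name from a module
import datetime as dt        # import a module under a shorter alias

root = math.sqrt(16)                     # use a function via the module name
average = mean([2, 3, 5, 7, 11])         # use the imported name directly
lecture_day = dt.date(2016, 1, 7).isoformat()
```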

Other Python Tidbits
Whitespace (tabs, spaces) is meaningful
Will talk more about this next time
Comment with #

Important Python Packages for Scientific Computing
Numpy: Library for fast computations on large arrays
Scipy: Library of codes for scientific computing (linear algebra, stats, interpolation, ...)
pandas: Popular data analysis toolkit
matplotlib: Most popular package for plotting

Numpy
ndarray: n-dimensional array
Useful for linear algebra and image manipulation

Numpy: Useful Functions
linspace: Makes linearly spaced bins
Generate random numbers with the random module
Make specific matrices with functions such as zeros and ones
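A hedged sketch of a few NumPy functions of the kind this slide describes. The exact functions shown in the lecture are not recoverable from the transcript, so linspace, random.rand, zeros, and ones are assumptions chosen to match the descriptions.

```python
import numpy as np

# An ndarray built from nested lists: 2 rows, 2 columns.
matrix = np.array([[1, 2], [3, 4]])
shape = matrix.shape

# Five linearly spaced values from 0 to 1, endpoints included.
bins = np.linspace(0, 1, 5)

# Three random numbers drawn uniformly from [0, 1).
samples = np.random.rand(3)

# Matrices filled with a specific value.
zeros = np.zeros((2, 3))
ones = np.ones((2, 2))
```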

IPython Notebook
Useful way of doing data analysis (collaborative)
Launch with:
Can do Python programming in browser environment

IPython: Integrating plotting Enable plotting inside the notebook with:

Pandas
Extremely useful data analysis package, perfect for analysis in social science, policy, etc.
Two fundamental objects:
DataFrame: 2D object
Series: 1D object
Nicely handles missing data, alignment of data, merging and joining datasets, handling timeseries data

Pandas: Basic Data Handling
Read a CSV file into a Pandas DataFrame with read_csv()

Pandas: Columns (Series) Refer to a single column using square brackets:

Pandas: Munging Data
Need floating point numbers only so that we can make a histogram of salaries
Problem 1: NaNs
Problem 2: Strings

Pandas: Munging Data
Solution 1: Use dropna() to drop rows with NaNs
Solution 2: Strip off dollar signs and represent as floating point
We can now visualize this data!
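The two cleaning steps can be sketched on a tiny made-up table. The rows and the column names are hypothetical stand-ins for the city salary file, and str.lstrip with astype(float) is one possible way to do the conversion, not necessarily the one used in the lecture.

```python
import pandas as pd

# Hypothetical rows; the real file and its column names may differ.
df = pd.DataFrame({
    "Name": ["DOE, JANE", "ROE, RICHARD", "POE, ALEX"],
    "Employee Annual Salary": ["$80000.00", None, "$95000.50"],
})

# Solution 1: drop rows that contain missing values (NaNs).
clean = df.dropna()

# Solution 2: strip the leading dollar sign, then convert the strings to floats.
salaries = clean["Employee Annual Salary"].str.lstrip("$").astype(float)
```

After these steps salaries is a numeric Series, which is what a histogram needs.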

Matplotlib Basic Plotting: Histogram
Histogram with hist()

Pandas plotting with plot(): Bar Chart

Pandas Supported Plot Types
Pass a string to the kind argument to tell pandas which plot to make
Supported plot types you will find useful:
Bar charts: bar or barh
Histograms: hist
Scatter plots: scatter
Pie charts: pie