Large-Scale Networks

Similar documents
STA141C: Big Data & High Performance Statistical Computing

The current topic: Python. Announcements. Python. Python

PYTHON FOR MEDICAL PHYSICISTS. Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital

Python Basics. Lecture and Lab 5 Day Course. Python Basics

Accelerating Information Technology Innovation

1. BASICS OF PYTHON. JHU Physics & Astronomy Python Workshop Lecturer: Mubdi Rahman

CS S-02 Python 1. Most python references use examples involving spam, parrots (deceased), silly walks, and the like

Senthil Kumaran S

Introduction to programming with Python

1 Truth. 2 Conditional Statements. Expressions That Can Evaluate to Boolean Values. Williams College Lecture 4 Brent Heeringa, Bill Jannen

Introduction to Web Scraping with Python

AI Programming CS S-02 Python

Accelerating Information Technology Innovation

Beyond Blocks: Python Session #1

Key Differences Between Python and Java

A Little Python Part 2

Introduction to Python

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

Outline. Simple types in Python Collections Processing collections Strings Tips. 1 On Python language. 2 How to use Python. 3 Syntax of Python

Scripting Languages. Python basics

Data Mining - Foursquare II. Bruno Gonçalves

Introduction to Machine Learning. Useful tools: Python, NumPy, scikit-learn

Introduction to Python

egrapher Language Reference Manual

Introduction to Python

CME 193: Introduction to Scientific Python Lecture 1: Introduction

Table of Contents. Dive Into Python...1

And Parallelism. Parallelism in Prolog. OR Parallelism

CS Programming Languages: Python

Loops and Conditionals. HORT Lecture 11 Instructor: Kranthi Varala

Introduction to Python

STEAM Clown & Productions Copyright 2017 STEAM Clown. Page 1

CS Advanced Unix Tools & Scripting

About Python. Python Duration. Training Objectives. Training Pre - Requisites & Who Should Learn Python

Introduction to Scientific Python, CME 193 Jan. 9, web.stanford.edu/~ermartin/teaching/cme193-winter15

Session 1: Introduction to Python from the Matlab perspective. October 9th, 2017 Sandra Diaz

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

Introduction to Python

Introduction to Python

Getting Started. Office Hours. CSE 231, Rich Enbody. After class By appointment send an . Michigan State University CSE 231, Fall 2013

Programming to Python

Module 04: Lists. Topics: Lists and their methods Mutating lists Abstract list functions Readings: ThinkP 8, 10. CS116 Fall : Lists


Variable and Data Type I

CIS192: Python Programming

Python for Earth Scientists

An Introduction to Python

Course Title: Python + Django for Web Application

Introduction to Bioinformatics

ENGR 102 Engineering Lab I - Computation

Pace University. Fundamental Concepts of CS121 1

PRG PROGRAMMING ESSENTIALS. Lecture 2 Program flow, Conditionals, Loops

Control Structures 1 / 17

Introduction to Python

Table of Contents EVALUATION COPY

Introduction to: Computers & Programming: Review prior to 1 st Midterm

PHY224 Practical Physics I Python Review Lecture 1 Sept , 2013

CMSC201 Computer Science I for Majors

Scientific Computing: Lecture 1

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming

Basic Python 3 Programming (Theory & Practical)

STSCI Python Introduction. Class URL

Multimedia-Programmierung Übung 1

Begin to code with Python Obtaining MTA qualification expanded notes

COMP1730/COMP6730 Programming for Scientists. Data: Values, types and expressions.

06/11/2014. Subjects. CS Applied Robotics Lab Gerardo Carmona :: makeroboticsprojects.com June / ) Beginning with Python

GE PROBLEM SOVING AND PYTHON PROGRAMMING. Question Bank UNIT 1 - ALGORITHMIC PROBLEM SOLVING

CSCE 110 Programming I

2.Raspberry PI: Architecture & Hardware Specifications

Python I. Some material adapted from Upenn cmpe391 slides and other sources

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

Part III Appendices 165

CS313D: ADVANCED PROGRAMMING LANGUAGE

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

CSc 120. Introduc/on to Computer Programing II. Adapted from slides by Dr. Saumya Debray. 01- a: Python review

ARTIFICIAL INTELLIGENCE AND PYTHON

Variable and Data Type I

CIS192 Python Programming

DaMPL. Language Reference Manual. Henrique Grando

ENGR 101 Engineering Design Workshop

THE IF STATEMENT. The if statement is used to check a condition: if the condition is true, we run a block

Introduction to Python

Python Programming Exercises 1

python 01 September 16, 2016

Introduction to Corpora

GIS 4653/5653: Spatial Programming and GIS. More Python: Statements, Types, Functions, Modules, Classes

ECE 364 Software Engineering Tools Lab. Lecture 3 Python: Introduction

ARG! Language Reference Manual

PYTHON FOR KIDS A Pl ayfu l I ntrodu ctio n to Prog r am m i ng J a s o n R. B r i g g s

CIS192 Python Programming. Robert Rand. August 27, 2015

C++ for Python Programmers

Lecture 12. PHP. cp476 PHP

Some material adapted from Upenn cmpe391 slides and other sources

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

Full file at

CPS122 Lecture: From Python to Java last revised January 4, Objectives:

Course May 18, Advanced Computational Physics. Course Hartmut Ruhl, LMU, Munich. People involved. SP in Python: 3 basic points

The Pyth Language. Administrivia

APIs and API Design with Python

Introduction to Python

Transcription:

Large-Scale Networks 3b Python for large-scale networks Dr Vincent Gramoli Senior lecturer School of Information Technologies The University of Sydney Page 1

Introduction

Why Python? What to do with Python? To get data (e.g., crawling online social networks) Manipulate and process data Visualize results (as well as understand and communicate results) Why Python? Rich collection of already existing building blocks (e.g., NetworkX) Open source free software Easy to learn The University of Sydney Page 3

What is Python? What is Python? An interpreted language Multi-platform tool: available for Windows, Linux, Mac OS X, Android Can be interfaced with other languages (e.g., C/C++) Write directly in the interpreter Interpret a python file The University of Sydney Page 4

Getting started Start the interpreter >>> print Hello, world! Hello, world! Do not declare type before assigning its value >>> a = 3 >>> b = 2*a >>> type(b) <type int > >>> print b 6 >>> a*b 18 String operands >>> b = hello >>> type(b) <type str > >>> b+b hellohello >>> 2*b hellohello The University of Sydney Page 5

Scalar Types

Scalar types Integer >>> 1 + 1 2 >>> a = 4 >>> type(a) <type int > Floats >>> c = 2.1 >>> type(c) <type float > Casting >>> float(1) 1.0 Complex >>> a = 1.5 + 0.5j >>> a.real 1.5 >>> a.imag 0.5 >>> type(1. + 0j) <type complex > Booleans >>> 3 > 4 False >>> test = (3 > 4) >>> test False >>> type(test) <type bool > The University of Sydney Page 7

Divisions Integer division >>> 3/2 1 Use floats >>> 3/2. 1.5 >>> a = 3 >>> b = 2 >>> a / b 1 >>> a / float(b) 1.5 Explicit integer division >>> 3.0 // 2 1.0 The University of Sydney Page 8

Lists and Strings

Lists Lists >>> l = [ red, blue, green, black, white ] >>> type(l) <type list > Accessing list elements >>> l[2] green >>> l[-1] white >>> l[3] black >>> l [ red, blue, green, black, white ] Slicing syntax L[start:stop:stride] >>> l[3:] [ black, white ] >>> l[:3] [ red, blue, green ] >>> l[::2] [ red, green, white ] Lists are mutable objects >>> l[2:4] = [ gray, purple ] >>> l [ red, blue, gray, purple, white ] The type of list elements may differ >>> l = [3, -200, hello ] The University of Sydney Page 10

Lists Add and remove elements >>> l = [ red, blue, green, black, white ] >>> l.append( pink ) [ red, blue, green, black, white, pink ] >>> l.pop() pink >>> l [ red, blue, green, black, white ] >>> l.extend([ pink, purple ]) [ red, blue, green, black, white, pink, purple ] >>> l = l[:-2] >>> l [ red, blue, green, black, white ] Reverse >>> r = l[::-1] >>> r [ white, black, green, blue, red ] >>> r2 = list(l) >>> r2 [ red, blue, green, black, white ] Sort >>> sorted(r) [ black, blue, green, red, white ] The University of Sydney Page 11

String Different syntaxes >>> s = Hello, how are you? >>> s = Hi, what s up? >>> s = Hello, how are you >>> s = Hi what s up Strings can be indexed/sliced like lists >>> a = hello >>> a[0] h >>> a = hello, world! >>> a[3:6] lo, Dictionary >>> tel = { alice : 1234, bob :2345} >>> tel[ carole ] = 3456 >>> tel {'bob': 2345, 'alice': 1234, 'carole': 3456} >>> tel[ bob ] 2345 >>> tel.keys() { bob, alice, carole } >>> tel.values() [2345, 1234, 3456] >>> carole in tel True The University of Sydney Page 12

Control Flow

Control flow Blocks are delimited by indentation! If/elif/else >>> if 2**2 == 4... print Obvious! Obvious! >>> a = 10 >>> if a == 1:... print(1)... elif a == 2:... print(2)... else:... print( a lot ) a lot For/range >>> for i in range(4):... print(i) 0 1 2 3 >>> for word in ( cool, powerful, simple ):... print( Python is %s % word) Python is cool Python is powerful Python is simple The University of Sydney Page 14

Control flow Python supports - while - break - Continue Conditional expressions - if <object> - Evaluates to false: - Any number equal to zero (0, 0.0, 0+0j) - An empty container (list, tuple, dictionary ) - Evaluates to true - everything else Tests a == b tests equality, with logics a is b tests identity a in b tests containment (If b is a dictionary, containment tests whether a is a key of b) Iterator over list/string/keys/ lines >>> for i in powerful :... if I in vowels:... print(i), o e u The University of Sydney Page 15

Functions and Modules

Functions Functions are delimited by indentation! >>> def test():... print( in test funct ) >>> test() in test funct They can return values >>> def test():... return 2*2 >>> test() 4 They can take arguments >>> def test(x):... Return x*2 >>> test(4) 8 They can take optional args >>> def test(x=2):... Return x*2 >>> test() 4 The University of Sydney Page 17

Modules Importing objects from modules >>> import os >>> os.listdir(. ) lists the content of current directory Use import * with caution >>> from os import * it makes it hard to track symbols Importing a single function >>> from os import listdir The University of Sydney Page 18

Modules Here is the list of modules we will use: - Python-twitter (module twitter) https://pypi.python.org/pypi/twitter - NetworkX (module networkx) https://networkx.github.io/ Version of Python recommended (v2.7) - Anaconda http://continuum.io/downloads - Canopy https://www.enthought.com/products/epd/ Other modules of interests: - Simplejson: JSON encoder/decoder https://pypi.python.org/pypi/simplejson/ - Requests: HTTP requests https://pypi.python.org/pypi/requests - BeautifulSoup: parse documents and find links: http://www.crummy.com/software/beautifulsoup/ - python-instagram: API for instagram https://github.com/instagram/python-instagram The University of Sydney Page 19

Conclusion

Conclusion We will use python To obtain network information (e.g., Twitter API) To analyze networks (e.g., NetworkX) Because There are many modules Free and open source Python is simple The University of Sydney Page 21

Going further Python scientific: - http://scipy-lectures.github.io/_downloads/pythonscientific-simple.pdf NetworkX reference: - http://networkx.github.io/documentation/latest/_downloads/networkx_reference.pdf The University of Sydney Page 22

Other Modules

HTTP headers Find the final location of the shortened url: http://bit.ly/googlescholar import urllib url = "http://bit.ly/googlescholar" r = urllib.request.urlopen(url) print(r.geturl()) url = r.geturl() r = urllib.request.urlopen(url) print(r.geturl()) print(r.info()) print(r.headers.items()) # Remove the "print" # to display this in a nicer way (in ipython) The University of Sydney Page 24

HTML parsing Extract the title of Feynman s 100 most cited papers from Google Scholar import requests from bs4 import BeautifulSoup url = "http://scholar.google.com/citations?user=\ B7vSqZsAAAAJ&hl=fr&cstart=0&pagesize=100" headers = { headers = {"User-agent" : # otherwise google scholar won t answer "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0)\ Gecko/20100101 Firefox/25.0"} } request = requests.get(url, headers = headers) soup = BeautifulSoup(request.text) table = soup.find("table", attrs={"class" : "cit-table"}) for paper in table.findall("td", attrs={"id" : "coltitle"}): print(paper.a.string) The University of Sydney Page 25

JSON Download the compressed JSON file: http://data.githubarchive.org/ 2015-01-01-15.json.gz. Write a script that reads the file and extracts all the actors id and repository id pairs. import gzip import sys import json t = gzip.open(sys.argv[1]).read() \ # Open the file (in bytes mode).decode() # Convert the text to string d = [json.loads(x) # Load the data... for x in t.split( \n ) #... for each line in the file... if x] #... that is not empty for item in d: print( %s %s % (item[ actor ][ id ], item[ repo ][ id ])) The University of Sydney Page 26