import matplotlib as mpl # As of July 2017 Bucknell computers use v. 2.x import matplotlib.pyplot as plt

Similar documents
For examples, documentation, tutorials, etc, see Astropy at ( # For retrieving an image from a URL

Problems from Hughes and Hase

Nonlinear curve-fitting example

import matplotlib as mpl # As of July 2017 Bucknell computers use v. 2.x import matplotlib.pyplot as plt

CS 237 Fall 2018, Homework 08 Solution

More on Curve Fitting

LECTURE 22. Numerical and Scientific Computing Part 2

4. BASIC PLOTTING. JHU Physics & Astronomy Python Workshop Lecturer: Mubdi Rahman

Tutorial 2 PHY409 Anadi Canepa Office, TRIUMF MOB 92 B ( )

PYTHON DATA VISUALIZATIONS

Data Science and Machine Learning Essentials

Introduction to Python

Interactive Mode Python Pylab

Sampling random numbers from a uniform p.d.f.

Prob_and_RV_Demo. August 21, 2018

Introduction to Scientific Python, CME 193 Jan. 9, web.stanford.edu/~ermartin/teaching/cme193-winter15

ipywidgets_demo July 17, Interactive widgets for the Jupyter notebook (ipywidgets)

Using the Matplotlib Library in Python 3

CHAPTER 6. The Normal Probability Distribution

NAVIGATING UNIX. Other useful commands, with more extensive documentation, are

Command Line and Python Introduction. Jennifer Helsby, Eric Potash Computation for Public Policy Lecture 2: January 7, 2016

CS 237 Lab Nine: Displaying Multivariate Data

Manual_implementation_of_the_Mersenne_twister_PseudoRandom_N

Advanced Python on Abel. Dmytro Karpenko Research Infrastructure Services group Department for Scientific Computing USIT, UiO

HW0 v3. October 2, CSE 252A Computer Vision I Fall Assignment 0

Computational Physics Programming Style and Practices & Visualizing Data via Plotting

Plotting With matplotlib

CME 193: Introduction to Scientific Python Lecture 5: Numpy, Scipy, Matplotlib

MATPLOTLIB. Python for computational science November 2012 CINECA.

Visualisation in python (with Matplotlib)

The SciPy Stack. Jay Summet

solving polynomial systems in the cloud with phc

Statistical Methods for NLP LT 2202

CSC 1315! Data Science

MAS212 Scientific Computing and Simulation

Matplotlib Python Plotting

Python in Economics and Finance

Intro to Research Computing with Python: Visualization

Data Science Essentials

Interpolation and curve fitting

Topic 5 - Joint distributions and the CLT

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Ch.1 Introduction. Why Machine Learning (ML)? manual designing of rules requires knowing how humans do it.

COMP 364: Computer Tools for Life Sciences

Lab 1 - Basic ipython Tutorial (EE 126 Fall 2014)

cosmos_python_ Python as calculator May 31, 2018

#To import the whole library under a different name, so you can type "diff_name.f unc_name" import numpy as np import matplotlib.

Week Two. Arrays, packages, and writing programs

Overlapping on/off sources

L15. 1 Lecture 15: Data Visualization. July 10, Overview and Objectives. 1.2 Part 1: Introduction to matplotlib

The Python interpreter

Effective Programming Practices for Economists. 10. Some scientific tools for Python

CME 193: Introduction to Scientific Python Lecture 6: Numpy, Scipy, Matplotlib

(2) Hypothesis Testing

Parallel Computing with ipyparallel

Introduction to Programming

Scientific computing platforms at PGI / JCNS

Python for Scientists

AUTHORS: FERNANDO PEREZ BRIAN E GRANGER (IEEE 2007) PRESENTED BY: RASHMISNATA ACHARYYA

PS6-DCT-Soln-correction

Episode 8 Matplotlib, SciPy, and Pandas. We will start with Matplotlib. The following code makes a sample plot.

Sampling from distributions

Numerical Calculations

Probability Models.S4 Simulating Random Variables

Python 101. Nadia Blagorodnova Caltech, 25th January 2017

Using jupyter notebooks on Blue Waters. Roland Haas (NCSA / University of Illinois)

Prof. Dr. Rudolf Mathar, Dr. Arash Behboodi, Emilio Balda. Exercise 5. Friday, December 22, 2017

5 File I/O, Plotting with Matplotlib

Image Processing (1) Basic Concepts and Introduction of OpenCV

Plotting with an introduction to functions

Computational Physics Adaptive, Multi-Dimensional, & Monte Carlo Integration Feb 21, 2019

INF : NumPy and SciPy

Candy is Dandy Project (Project #12)

MO101: Python for Engineering Vladimir Paun ENSTA ParisTech

Introduction to Python for Scientific Computing

Data Science and Machine Learning Essentials

Mathematical Programming

INTRODUCTION TO DATA VISUALIZATION WITH PYTHON. Working with 2D arrays


KNIME Python Integration Installation Guide. KNIME AG, Zurich, Switzerland Version 3.7 (last updated on )

COSC 490 Computational Topology

IPython Cypher Documentation

Frequency Distributions

Pandas plotting capabilities

A New Cyberinfrastructure for the Natural Hazards Community

Homework 11 - Debugging

Introduction to Matplotlib: 3D Plotting and Animations

Lab 16 - Multiclass SVMs and Applications to Real Data in Python

Anaconda Python Guide On Windows Github Pages

Latent Semantic Analysis. sci-kit learn. Vectorizing text. Document-term matrix

Scientific Computing: Lecture 1

tutorial : modeling synaptic plasticity

AMath 483/583 Lecture 28 June 1, Notes: Notes: Python scripting for Fortran codes. Python scripting for Fortran codes.

Central Limit Theorem Sample Means

Data Science and Machine Learning Essentials

Scientific Python. 1 of 10 23/11/ :00

Kernel Density Estimation (KDE)

Practical 06: Plotting and the Verlet integrator Documentation

PHY Introduction to Python Programming, week 5

Will Monroe July 21, with materials by Mehran Sahami and Chris Piech. Joint Distributions

Transcription:

PHYS 310 HW Problem Simulation of PHYS 211 M&M Experiment 6 Colors: Yellow, Blue, Orange, Red, Green, and Blue Assume 60 M&Ms in every bag Assume equal probabilities (well mixed, large "reservoir") Assume 24 students (bags) per section NOTE: In this notebook I use the module scipy.stats for all statistics functions, including generation of random numbers. There are other modules with some overlapping functionality, e.g., the regular python random module, and the scipy.random module, but I do not use them here. The scipy.stats module includes tools for a large number of distributions, it includes a large and growing set of statistical functions, and there is a unified class structure. (And namespace issues are minimized.) See https://docs.scipy.org/doc/scipy/reference/stats.html (https://docs.scipy.org/doc/scipy/reference/stats.html). In [1]: import scipy as sp from scipy import stats import matplotlib as mpl # As of July 2017 Bucknell computers use v. 2.x import matplotlib.pyplot as plt # Following is an Ipython magic command that puts figures in the notebook. # For figures in separate windows, comment out following line and uncomment # the next line # Must come before defaults are changed. %matplotlib notebook #%matplotlib # As of Aug. 2017 reverting to 1.x defaults. # In 2.x text.ustex requires dvipng, texlive-latex-extra, and texlive-fonts-recommended # which don't seem to be universal # See https://stackoverflow.com/questions/38906356/error-running-matplotlib-in-latex-ty mpl.style.use('classic') # M.L. modifications of matplotlib defaults using syntax of v.2.0 # More info at http://matplotlib.org/2.0.0/users/deflt_style_changes.html # Changes can also be put in matplotlibrc file, or effected using mpl.rcparams[] plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures plt.rc('axes', labelsize=16, titlesize=14) plt.rc('figure', autolayout = True) # Adjusts supblot parameters for new si Part 1 To get started, sample one bag of M&Ms, and count the numberof brown M&Ms. Do this by generating 60 random integers from the set 0,1,2,3,4,5, and let's say that "brown" = 5. In [2]: bag = sp.stats.randint.rvs(0,6,size = 60) # or sp.random.randint(0,6,60) bag Out[2]: array([1, 1, 1, 3, 2, 0, 4, 5, 4, 3, 1, 3, 5, 0, 5, 2, 5, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 3, 1, 2, 2, 1, 2, 5, 5, 3, 4, 1, 3, 4, 3, 0, 2, 0, 4, 1, 0, 2, 0, 1, 1, 5, 4, 1, 0, 4, 1, 1, 0, 0]) http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 1/6

Count the number of each color in the bag using sp.bincount(bag). The first element in the array is the number of occurences of 0 in "bag," the second element is the number of occurences of 1, etc. In [3]: sp.bincount(bag) Out[3]: array([ 9, 13, 9, 13, 9, 7]) For our "brown" = 5 choice, the number of brown M&Ms is the last element in the array returned by bincount, or sp.bincount(bag)[5]. In [4]: sp.bincount(bag)[5] Out[4]: 7 Now sample many bags Record number of brown M&Ms in each bag In [5]: # Long version of sampling many bags nb = 24 # number of bags data_section = sp.zeros(nb) # array in which to store data for a lab section for i in range(nb): bag = sp.stats.randint.rvs(0,6,size=60) data_section[i] = sp.bincount(bag)[5] In [6]: # Concise version of sampling many bags nb = 24 # number of bags data_section = sp.array([sp.bincount(sp.stats.randint.rvs(0,6,size=60))[5] for i in ran data_section Out[6]: array([14, 13, 10, 16, 13, 11, 11, 7, 7, 10, 11, 13, 13, 6, 17, 5, 9, 9, 12, 8, 6, 15, 13, 10]) In [7]: sp.mean(data_section), sp.std(data_section), sp.std(data_section)/sp.sqrt(len(data_sect Out[7]: (10.791666666666666, 3.1882488227168775, 0.65079856572379458) Answer for Part 1, the results from a single lab section: N section = 9.9 ± 0.6 http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 2/6

In [8]: plt.figure(1) nbins = 20 low = 0 high = 20 plt.hist(data_section,nbins,[low,high]) plt.xlim(0,20) plt.title("histogram of brown M&Ms per bag",fontsize=14) plt.xlabel("number of brown M&Ms") plt.ylabel("occurences") plt.show() Figure 1 Part 2 Now we simulate data from 200 lab sections. In [9]: nb = 24 # Number of bags in a section ns = 200 # Number of sections in class data_class = sp.zeros(ns) # array for results from each section for j in range(ns): data_section = sp.zeros(nb) # array in which to store section data for i in range(nb): bag = sp.stats.randint.rvs(0,6,size=60) data_section[i] = sp.bincount(bag)[5] data_class[j] = sp.mean(data_section) In [10]: sp.mean(data_class), sp.std(data_class) Out[10]: (9.9508333333333336, 0.58757032676003029) The standard deviation of the section averages (0.62) is consistent with that predicted by the standard deviation of the mean determined from a single section (0.6). http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 3/6

For a more through comparision with the predictions of the central limit theorem (CLT) I will compare the histogram of section averages with the the normal distribution predicted by the CLT. The average of the normal distribution is predicted to be 10, and the standard deviation is predicted to be σ parent / N, where is the number of bags in a section. The parent distribution is a binomial distribution with n = 60, and p = 1/6. N In [11]: sigmap = sp.stats.binom.std(60,1/6.) sigmap Out[11]: 2.8867513459481291 This result is consistent with the observed experimental standard deviation from a single section. When comparing the histogram with the normal distribution, we are really comparing the observed occurences in a bin with an integration of the probability distribution over finite interval: x2 N section p(x) dx N section p( x )( x 2 x 1 ) x1 where ( x 2 x 1 ) is the binwidth. (For wide bins it would be better to use the cdf rather than this approximation.) http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 4/6

In [12]: plt.figure(2) nbins = 25 low = 5 high = 15 binwidth = (high-low)/nbins plt.hist(data_class,nbins,[low,high],alpha=0.5) plt.xlim(low,high) x = sp.linspace(0,20,400) y = sp.stats.norm.pdf(x,10,sigmap/sp.sqrt(nb))*ns*binwidth plt.plot(x,y) plt.title("histogram of section averages",fontsize=14) plt.xlabel("number of brown M&Ms") plt.ylabel("occurences") plt.show() Figure 2 The histogram compares well with the preditions of the CLT. (For a more quantitative comparison, see Chapter 8 of Hughes and Hase.) Even though the parent distribution is binomial, the CLT says that the distribution of AVERAGES drawn from the parent will be gaussian. Version information version_information is from J.R. Johansson (jrjohansson at gmail.com) See Introduction to scientific computing with Python: http://nbviewer.jupyter.org/github/jrjohansson/scientific-python-lectures/blob/master/lecture-0-scientific- Computing-with-Python.ipynb (http://nbviewer.jupyter.org/github/jrjohansson/scientific-pythonlectures/blob/master/lecture-0-scientific-computing-with-python.ipynb) for more information and instructions for package installation. If version_information has been installed system wide (as it has been on Bucknell linux computers with shared file systems), continue with next cell as written. If not, comment out top line in next cell and uncomment the second line. http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 5/6

In [13]: %load_ext version_information #%install_ext http://raw.github.com/jrjohansson/version_information/master/version_info Loading extensions from ~/.ipython/extensions is deprecated. We recommend managing ex tensions like any other Python packages, in site-packages. In [14]: version_information scipy, matplotlib Out[14]: Software Version Python 3.6.1 64bit [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] IPython 6.1.0 OS Linux 3.10.0 327.36.3.el7.x86_64 x86_64 with redhat 7.2 Maipo scipy 0.19.1 matplotlib 2.0.2 Tue Aug 01 11:11:10 2017 EDT In [ ]: http://localhost:8888/notebooks/python/jupyter/juptyer_examples/mandm.ipynb 6/6