A practical guide to computer simulation II

Alexander K. Hartmann, University of Göttingen

June 18, 2003

7 Data analysis

7.1 Data plotting

Two programs are discussed here: gnuplot and xmgrace.

                    gnuplot                      xmgrace
  user interface    terminal, output to          window based
                    extra window
  speed of use      fast                         medium
  scripts           good                         not so good
  data fitting      very good                    medium
  output quality    medium                       good
  flexibility       medium                       very good

7.1.1 Using gnuplot

Invoke: call gnuplot in a shell. A prompt gnuplot> appears; commands can be entered. Example: plotting an x-y-dy file such as sg_e0_L.dat (ground-state energy of a spin glass as a function of the system size). File format: one point per line; comment lines (starting with #) are ignored, e.g.:

# ground state energy of +-J spin glasses
# L    e_0       error
3     -1.6710    0.0037
4     -1.7341    0.0019
...
14    -1.7866    0.0007

gnuplot> plot "sg_e0_L.dat" with yerrorbars

(short: p "sg_e0_L.dat" w e). A window with the plot pops up. Other styles are possible, e.g. with lines. For multicolumn files, the columns to be plotted can be selected, e.g.:

gnuplot> plot "test.dat" using 1:4:5 w e

More information: enter help plot. 3D plots are made with splot. Setting labels:

gnuplot> set xlabel "L"

and issue the plot command again, or use replot. To write the plot to a postscript file:

gnuplot> set terminal postscript
gnuplot> set output "test.eps"

and plot again. Scripts are ASCII files (e.g. commands.gp) containing gnuplot commands; they can, for example, be generated automatically by other scripts. Invoke gnuplot commands.gp from a shell.
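For illustration, such a script might look as follows (a minimal sketch combining the commands above; the file names are those used in this section):

# commands.gp -- minimal gnuplot script (sketch)
set terminal postscript
set output "test.eps"
set xlabel "L"
plot "sg_e0_L.dat" with yerrorbars

Invoking gnuplot commands.gp from a shell then produces test.eps without further interaction.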

7.1.2 Using xmgrace

Call xmgrace and a window will pop up; commands are given via menus. To load data for display, use the menu Data → Import → ASCII... and choose "single set" or "block data" (where some columns among many are selected). Styles, lines, symbols, sizes, error bars, colors, legend strings etc. can all be adjusted independently of each other via the menu Plot → Set appearance. For strings like legends and axis labels there are special formatting sequences, e.g.:

\x   greek font
\1   italic font
\4   sans-serif font
\S   change to superscript
\N   change back to normal script
\s   change to subscript

To change position, style, size etc. of the legend: menu Plot → Graph appearance. To change axis labels, tick marks, sizes etc.: menu Plot → Axis properties. To combine several plots into one (e.g. for insets): menu Edit → Arrange graphs. To print: menu File → Print setup.

7.2 Fitting data

Fitting = matching a parametrized function to data points by varying the parameters. Here, using gnuplot: as an example, fit the function f(L) = e + a·L^b to the spin-glass energy data. Define the function and starting values:

gnuplot> f(x)=e+a*x**b
gnuplot> e=-1.8
gnuplot> a=1
gnuplot> b=-1

The fit command uses the Marquardt-Levenberg algorithm [2]. One has to provide the function, the data set, and the parameters which are variable, e.g.:

gnuplot> fit f(x) "sg_e0_L.dat" via e,a,b

Gnuplot writes log information (not shown) and the result:

After 17 iterations the fit converged.
final sum of squares of residuals : 7.55104e-06
rel. change during last iteration : -2.42172e-09

degrees of freedom (ndf) : 5
rms of residuals (stdfit) = sqrt(WSSR/ndf) : 0.00122891
variance of residuals (reduced chisquare) = WSSR/ndf : 1.51021e-06

Final set of parameters            Asymptotic Standard Error
=======================            ==========================
e               = -1.78786         +/- 0.0008548    (0.04781%)
a               = 2.5425           +/- 0.2282       (8.976%)
b               = -2.80103         +/- 0.08265      (2.951%)

correlation matrix of the fit parameters:
                e      a      b
e           1.000
a           0.708  1.000
b          -0.766 -0.991  1.000

To estimate the quality of the fit, look at the three lines following "degrees of freedom". The number of degrees of freedom is the number of data points minus the number of fit parameters. WSSR denotes the deviation of the fit function f(x) from the data points (x_i, y_i ± σ_i) (i = 1, ..., N), given by

\chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - f(x_i)}{\sigma_i} \right]^2

The quality Q is the probability that the value of χ² is worse than in the current fit, given the assumption that the data points y_i are Gaussian distributed with mean f(x_i) and variance σ_i² [2]. The larger the value of Q, the better the quality of the fit. To calculate Q you can use the little program Q.c:

#include <stdio.h>
#include "nr.h"   /* Numerical Recipes header, declares gammq() */

int main(int argc, char **argv)
{
    float ndf, chi2_per_df;

    sscanf(argv[1], "%f", &ndf);           /* degrees of freedom */
    sscanf(argv[2], "%f", &chi2_per_df);   /* chi^2 per degree of freedom */
    printf("Q=%e\n", gammq(0.5*ndf, 0.5*ndf*chi2_per_df));
    return(0);
}

It uses the routine gammq from Numerical Recipes [2]. The program is called in the form Q <ndf> <WSSR/ndf>, e.g. Q 5 1.51021e-06 for the fit above, which yields Q ≈ 1.

Note 1: Whether the fit converges depends on the choice of the initial parameters; the algorithm may get trapped in a local minimum (e.g. try e=1, a=-3, b=1).

Note 2: In the example above, all data points entered with the same weight. To also include the error bars in the fit:

gnuplot> fit f(x) "sg_e0_L.dat" using 1:2:3 via e,a,b

With xmgrace: menu Data → Transformations → Non-linear curve fitting. Enter the formula y = a0+a1*x**a2, choose "parameters = 3" and enter the starting values a0=-1.8, a1=1, a2=-1. Select a set (to which the function is fitted) and hit "Apply". The result appears in a pop-up window, and the fitting curve is added as a new set. Note: there are NO error bars for the resulting parameters.

7.3 Finite-size scaling

Statistical physics deals with the thermodynamic limit (N → ∞), while simulations treat only small finite systems. Solution: finite-size scaling = studying several small system sizes. Example: the spin glass

H = -\sum_{\langle i,j \rangle} J_{ij} S_i S_j

with a fraction p of antiferromagnetic bonds J_ij = -1 and a fraction (1-p) of ferromagnetic bonds J_ij = +1. Ground states are calculated and the magnetisation m = (1/N) \sum_i S_i is evaluated as a function of p. We look for the transition from a ferromagnet (m > 0, p < p_c) to a spin glass (m = 0, p > p_c); results for different system sizes are shown in Figure 1. The theory of finite-size scaling [3] predicts

m(p, L) = L^{-\beta/\nu} \tilde{m}\left( L^{1/\nu} (p - p_c) \right)    (1)

where \tilde{m} is a universal, i.e. non-size-dependent, function, and β, ν are the critical exponents describing the algebraic behavior of the magnetisation and the correlation length near p_c. One obtains p_c, ν and β by plotting L^{β/ν} m(p, L) against L^{1/ν}(p − p_c): if the parameters are correct, the data collapse onto a single curve. Using gnuplot, with an input file m_scale.dat containing the three columns L, p, m(p, L):

gnuplot> b=0.27
gnuplot> n=1.1
gnuplot> pc=0.222
gnuplot> plot [-1:1] "m_scale.dat" u (($2-pc)*$1**(1/n)):($3*$1**(b/n))

The result is shown in Figure 2.
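These commands can also be collected in a script (a sketch; the file name collapse.gp and the axis labels are illustrative, the numbers are those from the text):

# collapse.gp -- finite-size scaling collapse (sketch)
b  = 0.27    # beta
n  = 1.1     # nu
pc = 0.222   # critical concentration p_c
set xlabel "L^{1/nu} (p-p_c)"
set ylabel "L^{beta/nu} m(p,L)"
plot [-1:1] "m_scale.dat" u (($2-pc)*$1**(1/n)):($3*$1**(b/n))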

[Figure 1: Average ground-state magnetisation m of a three-dimensional ±J spin glass as a function of the fraction p of antiferromagnetic bonds, for system sizes L = 3, 5, 14. Lines are guides to the eyes only.]

[Figure 2: Gnuplot output of the finite-size scaling plot. The ground-state magnetisation of a three-dimensional ±J spin glass as a function of the concentration p of the antiferromagnetic bonds is shown. For the plot, the parameters p_c = 0.222, β = 0.27 and ν = 1.1 have been used.]

Normally the parameters are not known. Start with reasonable assumptions and change them iteratively until a good data collapse is obtained. Hint: more convenient is the tool fsscale [4], where the scaling exponents can be changed interactively via keys.

7.4 Resampling

Given N uncorrelated random numbers x_i. As a reminder, the estimator for the average ⟨x⟩ is

\overline{x} = \frac{1}{N} \sum_{i=1}^{N} x_i    (2)

and the estimator for its error bar (65% confidence) is

\sigma_{\overline{x}} = \sqrt{ \left[ \overline{x^2} - \overline{x}^2 \right] / (N-1) }    (3)

Problem: how does one calculate error bars for combined measures g(x_1, ..., x_N), like the Binder parameter

g = 0.5 \left( 3 - \overline{x^4} / \overline{x^2}^{\,2} \right)    (4)

Solution: the jackknife method [1]. Define

g^J_i = g(x_1, ..., x_{i-1}, x_{i+1}, ..., x_N)    (5)

i.e. the value of g obtained when omitting data point i. One can show that

[g]_J \equiv \frac{1}{N} \sum_i g^J_i \approx \langle g \rangle    (6)

The estimator for the error bar is

\sigma_{g^J} = \sqrt{ (N-1) \left( [g^2]_J - [g]_J^2 \right) }    (7)

where [g^2]_J = (1/N) \sum_i (g^J_i)^2. Remark: it is usually sufficient to divide the data into K blocks and to omit one block at a time. A minimal implementation is sketched below.
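As an illustration, here is a minimal self-contained C sketch of the jackknife estimates (5)-(7) for the Binder parameter (4); the synthetic uniform random data and all names are chosen for this example only:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N 1000                       /* number of data points */

/* Binder parameter g = 0.5*(3 - <x^4>/<x^2>^2) of the n values in x[],
   omitting index 'skip' (pass skip = -1 to use all values) */
double binder(const double *x, int n, int skip)
{
    double x2 = 0.0, x4 = 0.0;
    int i, count = 0;

    for (i = 0; i < n; i++) {
        if (i == skip)
            continue;
        x2 += x[i] * x[i];
        x4 += x[i] * x[i] * x[i] * x[i];
        count++;
    }
    x2 /= count;
    x4 /= count;
    return 0.5 * (3.0 - x4 / (x2 * x2));
}

int main(void)
{
    double x[N], sum = 0.0, sum2 = 0.0, g_jack, sigma;
    int i;

    for (i = 0; i < N; i++)                    /* synthetic data in [0,1) */
        x[i] = rand() / (RAND_MAX + 1.0);

    for (i = 0; i < N; i++) {                  /* jackknife loop */
        double gi = binder(x, N, i);           /* g^J_i, eq. (5) */
        sum  += gi;
        sum2 += gi * gi;
    }
    g_jack = sum / N;                                        /* [g]_J, eq. (6) */
    sigma  = sqrt((N - 1.0) * (sum2 / N - g_jack * g_jack)); /* eq. (7) */

    printf("g = %f +/- %f (all data: %f)\n", g_jack, sigma, binder(x, N, -1));
    return 0;
}

For correlated data one would, as in the remark above, omit whole blocks instead of single values.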

8 Exercise

Use the program for ballistic deposition to record the average (over 1000 realizations) roughness W(L, t) for different system sizes (L = 10, 20, 40, 100) as a function of time (up to 100 sweeps). One expects W ∼ t^β for small times t < t_X(L) and W ∼ L^α for large times t > t_X(L). This combines to

W(L, t) \sim L^{\alpha} f(t/L^z)    (*)

with z = α/β. First plot W(t) for the different L. Then use gnuplot to obtain β by fitting W(t) for the different sizes, and make a scaling plot for (*); a gnuplot sketch follows below. Hint:

fit [10:100] f(x) "data.dat" ...

fits the data only in the range [10 : 100].

[Result: the fits give β around 0.25 (true value: β = 1/3); a good collapse is obtained for z = 1.2 and α = 0.37, i.e. β = α/z ≈ 0.31.]
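A possible gnuplot sketch for these two steps (the file names w_L10.dat and w_scale.dat, the column layouts and the fit range are assumptions for this illustration):

# fit the growth regime W ~ t^beta for one size, here L=10 (columns: t, W)
f(x) = c * x**beta
c = 1
beta = 0.3                               # starting values
fit [1:10] f(x) "w_L10.dat" via c,beta

# scaling plot for W(L,t) ~ L^alpha f(t/L^z)  (columns: L, t, W)
alpha = 0.37
z = 1.2
plot "w_scale.dat" u ($2/$1**z):($3/$1**alpha)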

References

[1] B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans (SIAM, 1982)

[2] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C (Cambridge University Press, Cambridge, 1995)

[3] K. Binder and D.W. Heermann, Monte Carlo Simulations in Statistical Physics (Springer, Heidelberg, 1988)

[4] The program fsscale was written by A. Hucht; see http://www.thp.uniduisburg.de/fsscale/