IST 3108 Data Analysis and Graphics Using R
Week 9
Engin YILDIZTEPE, Ph.D
2017-Spring

Introduction to Graphics

> y <- rnorm(20)
> plot(y)

In R, graphics are drawn in the active graphics device (window). When a graphics function is executed, R opens such a window if none is active. You can print directly from the graphics window or copy the graph to the clipboard; you can also resize the graph. A graph can also be saved in many other formats: PDF, bitmap, metafile, JPEG, PostScript, PNG or TIFF (File > Save as, when the graphics window is active).
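Besides the File menu, a graph can be written to a file from code. The following is a minimal sketch, assuming a PDF file in the session's temporary directory is acceptable; the file name week9_example.pdf is hypothetical and not from the slides.

```r
# Sketch: save a plot to a PDF file from code instead of the File menu.
# The file name below is an illustrative assumption.
set.seed(1)
y <- rnorm(20)

out_file <- file.path(tempdir(), "week9_example.pdf")

pdf(out_file, width = 6, height = 4)  # open a PDF graphics device
plot(y)                               # the plot is drawn into the file
invisible(dev.off())                  # close the device so the file is written

file.exists(out_file)
```

The plot only reaches the file once dev.off() closes the device, so forgetting that call is a common reason for an empty or unreadable file.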
Introduction to Graphics

When you produce a new plot, the old one is lost. In MS Windows, you can keep a history of your graphs by activating the Recording feature under the History menu (visible when the graphics window is active). You can then access old graphs with the Page Up and Page Down keys.

You can open a new active graphics window with the function windows() on Windows, X11() on Unix and quartz() on Mac OS X.

The traditional graphics model - plot

If a single vector is given, its values are plotted on the y-axis against their index (position in the vector).

> y <- rnorm(20)
> plot(y)

The type= argument controls the type of plot produced, as follows:

> plot(y, type="p")
> plot(y, type="l")
> plot(y, type="b")
> plot(y, type="h")
> plot(y, type="s")
> plot(y, type="n")
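The device functions named above are platform-specific. As a hedged alternative that is not in the original slides, dev.new() from the grDevices package opens a new device on any platform, and dev.cur(), dev.set() and dev.off() manage several open devices. A minimal sketch (in a non-interactive session the devices are files rather than windows):

```r
# Sketch: keep two plots alive at once by opening two devices.
set.seed(1)
y <- rnorm(20)

dev.new()                 # open a first device
first_dev <- dev.cur()    # remember its number
plot(y)

dev.new()                 # open a second device; the first plot is kept
second_dev <- dev.cur()
plot(y, type = "l")

dev.set(first_dev)        # make the first device active again
invisible(dev.off(second_dev))  # close the devices when finished
invisible(dev.off(first_dev))
```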
The traditional graphics model - plot

The plot types are:

"p"  points (i.e. a scatterplot)
"l"  lines (i.e. a line plot)
"b"  both (points and lines)
"o"  points and lines overplotted
"h"  high-density needles (vertical lines)
"s"  step function, horizontal step first
"n"  nothing (i.e. no plot contents)

The traditional graphics model - plot types

[Figure: examples of the plot types]
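To compare the types listed above side by side, they can be drawn in one window. This is a small sketch, assuming a 2 x 4 grid is convenient; the variable names plot_types and tp are illustrative.

```r
# Sketch: draw the same data once per plot type in a 2 x 4 grid.
set.seed(42)
y <- rnorm(20)

plot_types <- c("p", "l", "b", "o", "h", "s", "n")

old_par <- par(mfrow = c(2, 4))   # remember the previous layout
for (tp in plot_types) {
  plot(y, type = tp, main = paste0("type = \"", tp, "\""))
}
par(old_par)                      # restore the previous layout
```

The panel for type "n" shows only the axes, which is useful as an empty canvas for later lines() or points() calls.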
Standard arguments

> y <- rnorm(20)
> plot(y, type="l", lwd=3)
> plot(y, type="b", lwd=5)
> plot(y, type="l", col="grey")
> plot(y, type="l", col="red", lwd=5)
> plot(y, type="l", lty="dashed")
> plot(y, type="l", ylim=c(-4, 4))
> plot(y, type="l", ylim=c(-3, 3))
> plot(y, type="l", ylim=c(-3, 3), xlim=c(0, 50))
The traditional graphics model - plot line types

Line types can be specified either as an integer:

0: blank, 1: solid (default), 2: dashed, 3: dotted, 4: dotdash, 5: longdash, 6: twodash

or as one of the character strings "blank", "solid", "dashed", "dotted", "dotdash", "longdash" or "twodash".

Plots of one or two variables

If x and y are vectors, plot(x, y) produces a scatterplot of y against x.

Example:

[Figure: Plot of sin(x), y against x]
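The line types above can be previewed in a single plot. The sketch below draws one horizontal line per integer code 1 to 6 and labels it with the matching character string; the layout choices (empty canvas, axis labels) are illustrative assumptions.

```r
# Sketch: preview line types 1 to 6 with their names.
lty_names <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")

# Empty canvas: type = "n" sets up the axes without drawing data.
plot(c(0, 1), c(0.5, 6.5), type = "n", xaxt = "n", yaxt = "n",
     xlab = "", ylab = "", main = "Line types 1 to 6")

for (i in seq_along(lty_names)) {
  lines(c(0, 1), c(i, i), lty = i, lwd = 2)   # integer code i
}

# Label each line with the equivalent character string.
axis(2, at = seq_along(lty_names), labels = lty_names, las = 1, cex.axis = 0.7)
```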
Plots of one or two variables

Example:

> x <- seq(1, 10, length=100)
> y <- sin(x)
> plot(x, y, main = "Plot of sin(x)", type="l", col="blue", lwd=3)

Plots of one or two variables

Using the pressure dataset:

> ?pressure
> plot(pressure$temperature, pressure$pressure)
> plot(pressure ~ temperature, data=pressure)
> plot(pressure ~ temperature, data=pressure, type='l')
Exercise

> plot(pressure$temperature, pressure$pressure,
       xlab = "Temperature (deg C)", ylab = "Pressure (mm of Hg)",
       main = "pressure data: Vapor Pressure of Mercury")
> plot(mtcars$wt, mtcars$mpg, ylab = "miles per gallon",
       xlab = "Weight (lb/1000)",
       main="mtcars data: fuel consumption and weight",
       type="p", col="red", lwd=3)

Permanent changes: The par() function

The par() function modifies the traditional graphics state: each argument names a setting and gives its new value. The new values stay in effect for all later plots on the device until they are changed again. The following code sets new values for the col and lty settings.

> par(col="red", lty="dashed")
> plot(y, type="l")   # line is red and dashed

[Figure: red dashed line plot of y against Index]
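Because par() changes persist, it is often convenient to undo them afterwards. par() returns the previous values of the settings it changes, so they can be stored and restored. A minimal sketch, with old_par as an illustrative variable name:

```r
# Sketch: change graphics settings temporarily, then restore them.
x <- seq(1, 10, length.out = 100)
y <- sin(x)

old_par <- par(col = "red", lty = "dashed")  # returns the previous values
col_during <- par("col")                     # "red" while the change is active

plot(y, type = "l")                          # line is red and dashed

par(old_par)                                 # restore the previous settings
col_after <- par("col")
```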
Permanent changes: The par() function

> par(mfrow = c(2, 2))
> plot(pressure ~ temperature, data=pressure)
> plot(pressure ~ temperature, data=pressure, type="l")
> plot(pressure ~ temperature, data=pressure, type="l", lty=1)
> plot(pressure ~ temperature, data=pressure, type="b", lty=1)

[Figure: 2 x 2 grid of plots of pressure against temperature]

Permanent changes: The par() function

> par(mfrow=c(1,1), bg = "cornsilk")
> plot(pressure ~ temperature, data=pressure)

[Figure: pressure against temperature on a cornsilk background]
The par() function - Example

[Figure: five density panels titled Std. Normal, t df=1, t df=5, t df=10 and t df=30, each drawn for x from -4 to 4]

The par() function - Example

> x <- seq(-4, 4, length=100)
> par(mfrow = c(2, 3), lty=1, lwd=3)
> plot(x, dnorm(x), type="l", xlab="x value", ylab="Density", main="Std. Normal")
> plot(x, dt(x,1), type="l", main="t df=1", col="red")
> plot(x, dt(x,5), type="l", main="t df=5", col="blue")
> plot(x, dt(x,10), type="l", main="t df=10", col="darkgreen")
> plot(x, dt(x,30), type="l", main="t df=30", col="gold")
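The panels suggest that the t density moves towards the standard normal density as the degrees of freedom grow. As a rough numerical check, not part of the original slides, the largest vertical gap between each t density and the normal density can be computed on the same grid; max_gap is an illustrative name.

```r
# Sketch: largest gap between dt(x, df) and dnorm(x) on the plotting grid.
x <- seq(-4, 4, length.out = 100)
degf <- c(1, 5, 10, 30)

max_gap <- sapply(degf, function(df) max(abs(dt(x, df) - dnorm(x))))
names(max_gap) <- paste0("df=", degf)

print(round(max_gap, 4))   # the gap shrinks as df increases
```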
The lines() function - Example

[Figure: Comparison of t Distributions, Density against x value, with a legend titled Distributions: df=1, df=5, df=10, df=30, normal]

The lines() function - Example

> x <- seq(-4, 4, length=100)
> px <- dnorm(x)
> degf <- c(1, 5, 10, 30)
> colors <- c("red", "blue", "darkgreen", "gold", "black")
> labels <- c("df=1", "df=5", "df=10", "df=30", "normal")
> par(mfrow=c(1,1), lwd=2, lty=1, col="black")
> plot(x, px, type="l", lty=2, lwd=4, xlab="x value",
       ylab="Density", main="Comparison of t Distributions")
The lines() function - Example

> lines(x, dt(x,degf[1]), col=colors[1])
> lines(x, dt(x,degf[2]), col=colors[2])
> lines(x, dt(x,degf[3]), col=colors[3])
> lines(x, dt(x,degf[4]), col=colors[4])
> legend("topright", legend=labels, title="Distributions",
         lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)

boxplot

Produces box-and-whisker plot(s) of the given (grouped) values.

> boxplot(mtcars$mpg)
> boxplot(mtcars$mpg, notch=TRUE)
> boxplot(mtcars$mpg~mtcars$cyl)
> boxplot(mtcars$mpg~mtcars$cyl, main="Mileage by Cyl",
          col=c("red", "blue", "gold"))
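The grouped boxplot can also be written with a formula and a data argument, and boxplot() invisibly returns the numbers it draws. The sketch below stores that return value to inspect the group names, group sizes and five-number summaries; bp and the axis labels are illustrative choices.

```r
# Sketch: grouped boxplot via formula + data, keeping the returned statistics.
bp <- boxplot(mpg ~ cyl, data = mtcars,
              xlab = "Number of cylinders", ylab = "Miles per gallon",
              main = "Mileage by Cyl", col = c("red", "blue", "gold"))

bp$names   # group labels taken from cyl
bp$n       # number of cars in each group
bp$stats   # one column per group: lower whisker, hinge, median, hinge, upper whisker
```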
boxplot

> boxplot(mtcars$mpg~mtcars$cyl, main="Mileage by Cyl", xlab="Mileage",
          horizontal=TRUE, col=c("red", "blue", "gold"))
> boxplot(mtcars$mpg~mtcars$cyl, main="Mileage by Cyl", xlab="Mileage",
          horizontal=TRUE, col=terrain.colors(3))
> boxplot(mtcars$mpg~mtcars$cyl, main="Mileage by Cyl", xlab="Mileage",
          horizontal=TRUE, col=rainbow(3))

Histogram

The generic function hist() computes a histogram of the given data values.

> y <- rnorm(1000)
> hist(y)

[Figure: Histogram of y, Frequency against y]
Histogram

> hist(y, breaks="scott")
> hist(y, breaks="fd")
> hist(y, breaks="sturges")

Histogram

> hist(y, breaks=seq(-4, 4, 0.1))

[Figure: Histogram of y with bins of width 0.1, Frequency against y]
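The three named rules differ in how many cells they suggest. As a hedged illustration, the grDevices functions nclass.Sturges(), nclass.scott() and nclass.FD() return the suggested counts directly; the seed and variable names are assumptions made for reproducibility.

```r
# Sketch: compare the number of cells suggested by each rule.
set.seed(123)
y <- rnorm(1000)

rule_cells <- c(Sturges = nclass.Sturges(y),
                Scott   = nclass.scott(y),
                FD      = nclass.FD(y))
print(rule_cells)

# hist() treats the count as a suggestion and picks "pretty" break points,
# so the number of bars drawn can differ from the suggested value.
h <- hist(y, breaks = "FD", plot = FALSE)
length(h$counts)
```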
Histogram

> hist(y, breaks=50, col="tomato")

[Figure: Histogram of y, Frequency against y]

Histogram

> hist(y, breaks=20, col=colors())

[Figure: Histogram of y, Frequency against y]
Histogram

[Figure: histogram titled "Histogram for y", with x-axis label "y values" and y-axis label "Frequencies"]

Histogram

[Figure]
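A histogram with a custom title and axis labels, like the one in the figure, can be drawn with the main, xlab and ylab arguments. The slide's own code is not available, so the following is only a sketch under assumed data and colour.

```r
# Sketch: histogram with a custom title and axis labels.
# The data, seed and fill colour are assumptions, not taken from the slide.
set.seed(1)
y <- rnorm(1000)

h <- hist(y, main = "Histogram for y",
          xlab = "y values", ylab = "Frequencies",
          col = "lightblue")
```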
pie

Draws a pie chart.

> x <- c(14, 34, 20, 10, 22, 40)
> names(x) <- c("A", "B", "C", "D", "E", "F")
> pie(x, main = "Energy Consumption", col=rainbow(6))

[Figure: pie chart "Energy Consumption" with slices A to F]

pie

> pie(x, main = "Energy Consumption", col=rainbow(6),
      init.angle=110, clockwise=TRUE)

[Figure: pie chart "Energy Consumption" drawn clockwise from a starting angle of 110 degrees]
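Slices are easier to compare when each label also shows its share of the total. The sketch below computes the percentages and passes them through the labels argument of pie(); pct and slice_labels are illustrative names.

```r
# Sketch: pie chart whose labels include each slice's percentage.
x <- c(A = 14, B = 34, C = 20, D = 10, E = 22, F = 40)

pct <- round(100 * x / sum(x), 1)                      # share of the total
slice_labels <- paste0(names(x), " (", pct, "%)")

pie(x, labels = slice_labels, main = "Energy Consumption", col = rainbow(6))
```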
barplot

Creates a bar plot with vertical or horizontal bars.

> barplot(x, ylim = c(0, 40), col = rainbow(6),
          main = "Energy Consumption", ylab = "Percent in Category")

[Figure: vertical bar plot "Energy Consumption", Percent in Category for A to F]

barplot

> barplot(x, xlim = c(0, 40), col = rainbow(6),
          main = "Energy Consumption", ylab = "Percent in Category", horiz=TRUE)

[Figure: horizontal bar plot "Energy Consumption" for A to F]
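barplot() invisibly returns the midpoints of the bars, which makes it possible to place text above each bar. A minimal sketch, assuming a slightly taller y-axis so the labels fit; mids is an illustrative name.

```r
# Sketch: write each value above its bar using the returned bar midpoints.
x <- c(A = 14, B = 34, C = 20, D = 10, E = 22, F = 40)

mids <- barplot(x, ylim = c(0, 45), col = rainbow(6),
                main = "Energy Consumption", ylab = "Percent in Category")

text(mids, x, labels = x, pos = 3)   # pos = 3 places the text above the point
```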
barplot

> barplot(x, ylim = c(0, 40), xlim=c(0, 8), col=rainbow(6),
          main="Energy Consumption", ylab="Percent in Category",
          axis.lty=1, legend=names(x))

[Figure: bar plot "Energy Consumption" for A to F with a legend]

Example

How can we draw this pie chart?

[Figure: target pie chart]
curve function

Draws a curve corresponding to a function over the interval [from, to].

curve(expr, from = NULL, to = NULL, ...)

> curve(dnorm(x, m=10, sd=2), from=0, to=20, main= "Normal distribution")

[Figure: Normal distribution, dnorm(x, m = 10, sd = 2) against x]

curve function

> curve(dgamma(x, scale=1.5, shape=2), from=0, to=15, main="Gamma distribution")
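Several curves can share one plot through the add argument of curve(). The sketch below overlays two normal densities with the same mean and different standard deviations; the second density and the legend are illustrative additions, not from the slides.

```r
# Sketch: overlay a second curve on an existing one with add = TRUE.
first_curve <- curve(dnorm(x, mean = 10, sd = 2), from = 0, to = 20,
                     ylab = "Density", main = "Normal distributions")

curve(dnorm(x, mean = 10, sd = 4), from = 0, to = 20,
      add = TRUE, lty = "dashed", col = "red")

legend("topright", legend = c("sd = 2", "sd = 4"),
       lty = c("solid", "dashed"), col = c("black", "red"))
```

curve() also returns the plotted points invisibly, so first_curve$x and first_curve$y hold the coordinates of the first curve.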
curve function

> curve(dweibull(x, scale=2.5, shape=1.5), from=0, to=15, main="Weibull distribution")

qqnorm - qqline

This plot is used to judge whether your data are close to normally distributed. A Q-Q plot cannot prove that the data are normally distributed, but it can reveal clear departures from normality.

> y <- rnorm(1000, mean=10, sd=2)
> qqnorm(y)
> qqline(y, col=2)

[Figure: Normal Q-Q Plot, Sample Quantiles against Theoretical Quantiles]
qqnorm - qqline

> x.wei <- rweibull(n=200, shape=2.2, scale=1.2)
> qqnorm(x.wei)
> qqline(x.wei, col=2)

[Figure: Normal Q-Q Plot, Sample Quantiles against Theoretical Quantiles]
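To see what qqnorm() plots, the same picture can be built by hand: the sorted sample is plotted against standard normal quantiles at the probabilities given by ppoints(). This is a sketch for illustration, with an assumed seed and illustrative variable names.

```r
# Sketch: build a normal Q-Q plot by hand and compare it with qqnorm().
set.seed(2017)
y <- rnorm(1000, mean = 10, sd = 2)
n <- length(y)

theoretical <- qnorm(ppoints(n))   # standard normal quantiles
sample_q    <- sort(y)             # ordered sample values

plot(theoretical, sample_q,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
     main = "Q-Q plot built by hand")
qqline(y, col = 2)                 # line through the first and third quartiles

# qqnorm() computes the same coordinates (kept in the original order of y).
built_in <- qqnorm(y, plot.it = FALSE)
```

Points that follow the line closely are consistent with normal data; systematic curvature, as with the Weibull sample above, indicates a departure from normality.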