Experiment 1
CH 222 - Fall 2004
INTRODUCTION TO SPREADSHEETS

Introduction

Spreadsheets are valuable tools utilized in a variety of fields. They can be used for tasks as simple as adding or subtracting two numbers to those as complicated as keeping the records for an entire company or modeling the results of a complicated chemistry experiment. In science, spreadsheets are used in molecular biology, aquatic chemistry, chemical engineering, and soil metabolism. In business they are used by auditors, accountants, and small business owners. These inexpensive, powerful, easy-to-use software packages bring data manipulation and graphics to everyone with a personal computer.

The goal of this lab is to enable the student to become proficient enough to feel comfortable using a spreadsheet for data analysis and graphics. Whether you choose to use Excel, Quattro Pro, or 1-2-3, the information contained in this tutorial should be relevant because the main difference between software packages is the menu heading under which a specific command is located.

Two terms which will be used extensively in this manuscript are Click on (click using the left mouse button) and Right Click (click using the right mouse button). Excel is the spreadsheet you will use today, so double click on the Excel icon.

(If you have had some experience with spreadsheets you may skip this section and go on to part B.)

A - Spreadsheet Fundamentals

Spreadsheets are nothing more than a large array of columns (A, B, C, D,...) and rows (1, 2, 3, 4,...). They are the electronic equivalent of a paper ledger. We will define each column-row position as a cell, with the column-row as its name, as in A2 or D5. Three types of data can be inserted into a cell:

Labels can be numbers or text and are often located at the top of a column.
Values are numbers.
The formula is the most powerful of the data types and can include values and/or cell names. The formula instructs the spreadsheet (and hence the computer) to do a mathematical calculation and display the results in the cell.
Let's try this with a simple formula: enter =1+2 into any cell. (Excel requires an operator to begin a formula, and this is often an =, +, or - sign.) The cell you chose should now contain the value 3, but if this were the only way a spreadsheet cell could calculate, it would be very tedious for large equations.

Enter the values 5, 30, and 99 in 3 cells in a column, maybe D7, D8, and D9. What makes the spreadsheet powerful is the ability to reference cells in a formula. In a cell below enter =D7+D8+D9. This will return 134 if the numbers have been entered correctly.

The equation =D7+D8+D9 can also be entered with less typing. One nice aspect of modern spreadsheets is that cell references (D7, D8, D9,...) do not need to be typed. Instead, pointing and clicking serves the same function. In a cell below the last, enter =, then click on cell D7, enter +, click on cell D8, enter +, click on cell D9, and press Enter. This also returns 134. By entering a formula in this fashion you can easily reference distant cells.

Other calculation operators include the hyphen for subtraction (D7-D8); the slash for division (D9/D7); the asterisk for multiplication (D7*D7); and the caret for exponentiation (D7^2).

Excel has many built-in functions. A few of those you will use during this quarter include statistical functions (SUM and AVERAGE) and mathematical and trigonometric functions (SIN and COS). [To see the selection of possible functions press the F1 key, type worksheet into the white box, double click on worksheet_function, and double click on one.] Try the SUM function: type =sum( into a cell, click on D7, and with the mouse button depressed highlight D8 and D9. You should see a highlighted box around the three cells. Type the closing parenthesis and press Enter. Again there should be a 134.
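For readers who want to check the arithmetic outside Excel, the cell formulas above can be mimicked in a few lines of Python. This is only an illustrative sketch: the dictionary named cells and the other variable names are assumptions made for this example and are not part of the lab or of Excel.

```python
# Stand-in for the three spreadsheet cells used in the text.
cells = {"D7": 5, "D8": 30, "D9": 99}

# Equivalent of =D7+D8+D9
total_by_reference = cells["D7"] + cells["D8"] + cells["D9"]

# Equivalent of =SUM(D7:D9)
total_by_sum = sum(cells[name] for name in ("D7", "D8", "D9"))

# The other operators mentioned in the text.
difference = cells["D7"] - cells["D8"]   # =D7-D8
quotient = cells["D9"] / cells["D7"]     # =D9/D7
product = cells["D7"] * cells["D7"]      # =D7*D7
power = cells["D7"] ** 2                 # =D7^2 (Python writes ^ as **)

print(total_by_reference, total_by_sum)  # both totals are 134
print(difference, quotient, product, power)
```

Note that Python uses ** for exponentiation, whereas the spreadsheet uses the caret.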
Moving Data Around the Spreadsheet

Suppose you want to move a value from one position in the spreadsheet to another. Select the cell and point the cross (cursor) toward the colored edge; the cursor should change to an arrow. At this point press the left mouse button and drag the cell to the desired position. Whole columns, rows, or groups of cells can be moved in this fashion. To move a column or group of columns, press on the letter of the first column, highlight the rest, point the cross (cursor) toward the colored edge, and drag them to their new location once the cursor changes. (Highlighting will be defined as clicking on a cell, holding the left mouse button down, and dragging the cursor to the desired cell.) Go ahead and try this now.

Another very useful property is copy (or cut) and paste. Under the Edit menu you will notice that Edit Copy, Edit Paste, and Edit Cut are equivalent to Ctrl-c, Ctrl-v, and Ctrl-x. Most software packages (not just spreadsheets) include this feature. This means the user can copy and paste tables and graphs into word processing and other programs.
Moving the cursor around the spreadsheet

With a screen full of values in a column it is easy to move the cursor around by pointing at a cell. But how would it be if there were 1,000 or more cells containing values? Using the arrow keys allows the user to move around the spreadsheet very quickly. Try going to the cell below the last value in your column by pressing the End key and then the down arrow. How far do you go? Do the same with the other arrow keys. To get the cursor back to the start of the spreadsheet press Ctrl-Home. Just pressing the Home key sends the cursor to the beginning of the row. Using the End key followed by an arrow key will send the cursor to the end of the row or column, depending on which arrow key is pressed. Try some of these.

Modern spreadsheet programs such as Excel can hold many spreadsheets. At the bottom of the open program you should see many tabs labeled Sheet 1, Sheet 2, Sheet 3,... Each of these is a spreadsheet, and cells in them can be used in a function. In Excel terminology the whole file is known as a workbook.

B - Graphing

Now that some of the spreadsheet fundamentals have been covered, we will cover the preliminaries to graphing. (If you already know how to make an x,y plot skip to part C.)

A.) Make an array of data consisting of 2 columns and 20 rows, with 1-20 in one column and x^2 in the second. Begin in cell B3 by entering a 1. The rest of the values in column B (B4-B22) should be 2-20. These numbers can be entered directly, or you can use the equation x+1. Click on B4 and enter =B3+1. This can be done either by typing =B3+1 or by typing =, clicking on B3, and then typing +1. We want to do this for 20 rows. What has been done is convenient for the first two cells, but for the rest let us speed up the process: click on cell B4 and press Ctrl-c (copy), click on B5, hold the Shift key down and click on cell B22 (B5 through B22 should now be highlighted), and then press Ctrl-v.
If done correctly there should be a column of increasing numbers from 1 to 20. In newer spreadsheets the cells can be filled even faster. Let's try this in column C; afterward the values can be deleted, or you can move to column D for the x^2 data. (To delete multiple cells, highlight them and press the Delete key.) Put a 1 in cell C3. In cell C4 enter the formula =C3+1 or the value 2. Highlight cells C3 and C4. Notice the square at the bottom right of the highlighted box. When the cursor is held over this square it turns into a thin cross. Left click, extend the highlighted box to cell C22, and release the mouse button. This column of numbers should be the same as the first. In cell D3 enter =C3^2 and copy this equation down to D22. Often it is useful to label the columns, so type x (or X) in cell C2 and X^2 in cell D2.
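The fill-down used above amounts to applying x+1 repeatedly and then squaring each entry. A minimal Python sketch of the same two-column table is given here purely as a cross-check outside Excel; the names x_values and x_squared are illustrative and do not correspond to anything in the spreadsheet.

```python
# Column C: start at 1 (cell C3) and apply "previous cell + 1" for the
# remaining 19 rows, which is what copying =C3+1 down to C22 does.
x_values = [1]
for _ in range(19):
    x_values.append(x_values[-1] + 1)

# Column D: the formula =C3^2 copied down, so each entry is x squared.
x_squared = [x ** 2 for x in x_values]

# Print the table with the same labels used in row 2 of the sheet.
print("x", "X^2")
for x, x2 in zip(x_values, x_squared):
    print(x, x2)
```

The last row printed should be 20 and 400, matching cells C22 and D22.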
B.) Let us plot this data. There is an icon for charts, but its position may change according to the activated toolbars (the strips of icons at the top of the sheet). If this icon cannot be found, the graph can be started by highlighting the data and choosing Insert, and then Chart, on the Worksheet Menu Bar. Highlight the x and x^2 data, from 1 to 20, click on the chart wizard (or Insert, Chart), choose XY (Scatter), and then Finish. You should have a plot similar to the one below.

[Figure: XY scatter plot of Series1, with the y axis running from 0 to 500 and the x axis from 0 to 25.]

The purpose of a graph is to display the results in a manner that allows another person to easily interpret the data. A table, of course, would give a tabular representation of the data. Even if the reader recognizes that this graph represents a power function, it is hard to know which power has been plotted. This can easily be corrected by adding some labels. Right click on the graph (in between grid lines) and choose Chart Options. In the box Value (X) axis: enter observations; for Value (Y) axis: enter X^2. You might also want to give the graph a title in the Chart title: box.

Excel uses a grey background by default. Depending on the color of the data points it might be hard for the reader to see them. The background can be removed by right-clicking in the grey area, choosing Format Chart Area, and under Area selecting None. The box color can be changed by clicking on the Color: box (or arrow) and selecting a color. Choose black and then OK. To remove the grid lines, right click on them and choose Clear.

To change the axes, move the cursor over a number on the axis to be changed and right-click. (Small squares will appear over the axis extremes.) Choose Format Axis. There are a number of choices under this menu; choose one of the tabs and experiment. Pressing Ctrl-z will restore your graph to its last form. Pressing Ctrl-z several times will usually step back a few more times.
The legend, Series1, can be removed by clicking on it and pressing Delete, or it can be given a name (useful if there were more than one curve) by right clicking on the graph, choosing Source Data, then the Series tab, and entering either text or the cell position of the label in the box marked Name:.

[Figure: the same scatter plot with the y axis labeled X^2 and the x axis labeled observations.]

C - Plotting Some Functions

Now that you have some feeling for the power of functions and the ability to graph, try plotting a function on your own. You can choose one of the following: e^x, Cos(x), Sin(x), Cos(x)*Sin(x), or make one up. x will be replaced with a cell address. The trigonometric functions Cos, Sin, Tan, etc. require the value in the parentheses to be in radians, so if x is in degrees multiply by PI()/180, as in x*PI()/180. (PI() is another Excel function; it returns π, approximately 3.14.)

A starting point might be to type step size in cell B5 and 1 in C5 if you choose e^x, or 180 in C5 if you choose one of the trigonometric functions. Set the first x value to 0 in, say, C8 and the second value to =C8+$C$5, so that each x value is the previous one plus the step size stored in C5. Copy this for 20 or so values. The dollar signs mean always reference the specified cell. A single dollar sign in front of the column ($C5) holds that column while allowing the row to change. Switching (C$5) holds the row while allowing the column to change. Try the copying with and without the dollar signs and see what happens. When you have a set of x values, make some y values and then a plot. When you have plotted your function, change the value in C5 and see what happens.

D - Statistics

Average and Standard Deviation

Often in business or science one wants to examine the relationship between a set of numbers. At home the set of numbers might be the total monthly expenses, or the expense for gas, water, electricity, or gasoline for the year (or several years). The occupant might want to determine the average expense or the month in which the cost was highest. In science one might want to determine some physical property from the data set.
Type the following numbers into a column: 149.9, 152.9, 166.9, 167.9, 177.9, 181.9, 185.9, 184.9. Assume these values represent the price of gasoline (in cents/gallon) for eight months and that we want to know the average price for these months. You could add them individually in one long formula and divide by eight, use the SUM function and divide by eight, or use the AVERAGE function. Try it using the last method. Type =Average(, highlight the values you want to average, and type the closing parenthesis. The average is 171.0 cents/gallon.

One statistic that is often used is the standard estimate of the variability of the data. This is called the standard deviation and is calculated as

$$ s = \sqrt{\frac{n\sum x^{2} - \left(\sum x\right)^{2}}{n(n-1)}} $$

where in our case x is each gasoline price and n is the number of prices. Fortunately Excel has this as a built-in function, and all you need to do is type =stdev(, highlight the values you want to use, and type the closing parenthesis. For the gasoline prices used above, the standard deviation turns out to be 14.0.

Linear Regression

If we had the following set of data we might be interested in whether there was a linear relationship between the x and y variables. Using a spreadsheet it is very easy to plot the data and draw through it a best-fit straight line. The equation for the line is of course y = mx + b. Excel easily determines the slope of the line (m) and the y intercept (b). With a bit of algebraic manipulation the x intercept can also be determined. Go ahead and enter the following values into some cells and then calculate the slope and y intercept.

x      y
1.0    1.9
2.3    3.0
3.2    3.1
4.5    4.6
5.8    5.1
6.9    5.8
8.1    6.5

[Figure: plot of y versus x with the best-fit line y = 0.6479x + 1.3422, R^2 = 0.9844; the y intercept and the x intercept of the line y = mx + b are marked.]

There are a number of ways to get Excel to produce the best-fit line; the simplest is to use the trend line feature of the graph.
To do this, click on a data point (notice that all the points are highlighted) and then, without moving the mouse, right click on the data point and select Add Trendline. Then choose the Options tab and click the bottom two squares (Display equation on chart and Display R-squared value on chart), and then OK. If the values entered are the same as the ones above, the equation for the line will be the same. See if you can determine the value of the x intercept.

[An aside to the data entry: notice how Excel wants to display the values 1.0 and 3.0 as 1 and 3. The program can be forced to show the same number of decimal places for all values by highlighting the values, right clicking on them, choosing Format Cells, the Number tab, the Category Number, and then entering (or selecting using the arrows) the number of Decimal places.]

R^2, the square of the correlation coefficient R, is a measure of the degree of linear correlation between the x and y values. It ranges from 0 when there is no correlation to +1 when there is complete correlation. Change some of the values in the spreadsheet and examine the change in R^2.

Although the trend line option is quick and gives the y intercept and the slope, it does not provide any information about how reliable those values are. We will call this the uncertainty. As an example of the uncertainty consider the following: you have been invited over to a friend's house for the first time, and although you know the town in which they live, you don't know the name of the street on which they live nor their address. You cannot be very certain about where they live. If you had the name of the street they lived on and maybe the color of their house you would be more certain, and if you had the street address you would know exactly where they lived.

The slope and intercept also have uncertainties attached to them. If the uncertainties are as large as the values themselves, the numbers may not be very useful. Excel can give this information by performing a linear regression on the data. We will not concern ourselves with the mathematics for a regression; we will simply use the built-in routine.
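For readers curious about where the trend line numbers come from, the slope, intercept, and R^2 quoted above can be reproduced with the standard least-squares formulas for a straight line. The Python sketch below is only a cross-check under that assumption; it is not a description of how Excel computes the trend line internally, and the variable names are illustrative.

```python
# The seven (x, y) pairs from the table in the text.
x = [1.0, 2.3, 3.2, 4.5, 5.8, 6.9, 8.1]
y = [1.9, 3.0, 3.1, 4.6, 5.1, 5.8, 6.5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and cross products about the means.
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares slope (m) and y intercept (b) of y = mx + b.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# R^2 for a straight-line fit is the squared correlation coefficient.
r_squared = s_xy ** 2 / (s_xx * s_yy)

print(f"y = {slope:.4f}x + {intercept:.4f}")  # y = 0.6479x + 1.3422
print(f"R^2 = {r_squared:.4f}")               # R^2 = 0.9844
```

Changing one of the y values in the list and rerunning shows the same drop in R^2 that you see when editing the spreadsheet.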
On your computer make sure a cell in the spreadsheet has been selected (not a graph) and choose Tools. At the bottom of the menu there should be a selection for Data Analysis. If you don't see this option then it is possible that the available add-ins have not been started. To start them choose Tools, Add-Ins, make sure the boxes for Analysis ToolPak and Analysis ToolPak - VBA have been selected with a check, and press OK.

When the add-ins have been loaded, choose Data Analysis (notice how many analysis tools there are), then Regression, and OK. You are now in the area where you tell Excel which data you would like to use for the regression. For Input Y Range highlight the values in the y column (click on the multicolored box and then choose the values). Do the same for the Input X Range. Click on the white circle for the Output Range and then click in the multicolored box for this option. For some reason Excel jumps back to Input Y Range, and if you are not careful you will place the output range there and Excel will give an error message when you try to perform the regression. Go ahead and choose $B$18 for the output, making sure it is in the right box, and click on OK. The regression gives the following output:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.992154
R Square            0.984369
Adjusted R Square   0.981242
Standard Error      0.22855
Observations        7

ANOVA
             df   SS         MS         F          Significance F
Regression   1    16.4474    16.4474    314.8719   1.04E-05
Residual     5    0.261176   0.052235
Total        6    16.70857

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      1.342233       0.187025         7.176754   0.000817   0.861471    1.822996    0.861471      1.822996
X Variable 1   0.647936       0.036514         17.74463   1.04E-05   0.554073    0.741799    0.554073      0.741799

Notice that R Square (under Regression Statistics), the slope (X Variable 1), and the intercept (under Coefficients) are the same as with the trend line option. Of the rest of the information we will only make use of the Standard Error (uncertainty). We have used two significant figures for our x and y data, so the slope (m) should be reported as 0.65 ± 0.04 and the y intercept as 1.3 ± 0.2.

To test the standard error, change the first and second y values to 7.0 and 1.0. Choose Regression as was done previously, but make the output cell $K$18. Compare the increase in the standard error for both the slope and the y intercept.
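The Standard Error values in the regression output can be cross-checked by hand with the textbook formulas for a straight-line fit. The following Python sketch assumes ordinary least squares with n - 2 degrees of freedom, which is consistent with the Residual df of 5 in the ANOVA table; it is an illustration rather than Excel's own routine, and all variable names are made up for this example.

```python
import math

# The original seven (x, y) pairs used for the regression.
x = [1.0, 2.3, 3.2, 4.5, 5.8, 6.9, 8.1]
y = [1.9, 3.0, 3.1, 4.6, 5.1, 5.8, 6.5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares fit of y = slope * x + intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual sum of squares (the "Residual SS" entry of the ANOVA table).
residual_ss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

# "Standard Error" under Regression Statistics: scatter about the line.
std_error = math.sqrt(residual_ss / (n - 2))

# Standard errors of the two coefficients.
se_slope = std_error / math.sqrt(s_xx)
se_intercept = std_error * math.sqrt(1 / n + mean_x ** 2 / s_xx)

print(f"slope     = {slope:.2f} +/- {se_slope:.2f}")          # 0.65 +/- 0.04
print(f"intercept = {intercept:.1f} +/- {se_intercept:.1f}")  # 1.3 +/- 0.2
```

Replacing the first two y values with 7.0 and 1.0 in the list and rerunning gives the larger standard errors that the second regression reports.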
DUE

Use Excel to plot the following temperature-volume data and do a linear regression:

T (°C)   V (ml)
0        190
10       192
25       200
32       210
33       215

The equation being plotted is V = (nR/P)*T + b. The value (nR/P) is the slope and b is the y intercept. The volume is in ml and the temperature is in °C. Don't worry about converting to liters or Kelvin.

Calculate the x intercept (the value of T (or x) when V (or y) equals 0). Set V equal to zero and rearrange the equation to:

-b = (nR/P)*T or -b = slope*T

Solve for T:

T = -b/slope

Extend your plot axes so that the regression line crosses the x axis. Does the value on the plot agree with your calculation?

Print out and turn in your spreadsheet during Thursday's recitation.

Make sure that the plot axes are labeled.
Make sure the x intercept is clearly shown on the graph.
Give your graph a title.
Don't forget your name! Include your TA's name also.
Highlight the calculation of the x intercept. Include the correct units for the value.
Answer the following question: What important scientific value is obtained from the x intercept of your graph?

Make sure you print out two copies of your spreadsheet, one to turn in and the other to save. Or you can email a copy of your spreadsheet to yourself.