A Brief Overview of Using STATA on the HPC Windows Terminal Server


Logging in to the HPC Windows Server:

You must begin by logging in to the High Performance Computing Windows terminal server.

On a Windows machine, select Start > All Programs > Accessories > Communications > Remote Desktop Connection. In the Computer name field, type ithaca.unbc.ca

Note: If you want to access the local drives on your computer, you need to make that selection on the Local Resources tab under Options >>

After a few seconds, the following login screen should appear:

Log in using your UNBC user name and password, making sure that you are logging in to the UNI system.

Making sure you can access the StuDept (S:) Drive:

When you first log in, we need to make sure that you can access the S: (StuDept) drive because this is where the sample files are stored. To check if the drive is mounted, or to mount it if necessary, you need to launch Windows Explorer. From the Start Menu, select Start > My Computer.

Check to see if StuDept is mounted as one of your network drives:

If it is, then you can skip the next steps, which work through how to mount a network drive.

To mount a new network drive, select Map Network Drive from the Tools menu:

When the following dialog box appears, select the letter S: for the Drive (don't worry about the \\PG-UNI... portion as it won't be on your screen). Ensure that the Reconnect at logon check box is checked (so you won't have to go through this every time you log in) and then click on the Browse button:

Expand the UNI tree (this may take several seconds) and scroll down until you get to the Pg-uni-dc-02 node; click on the plus sign next to the node and the screen should resemble the following:

Select StuDept and then click OK. Your S: drive should now be accessible through Windows Explorer (and as long as you had the reconnect checkbox set, you should only have to do this once). Check in Windows Explorer to see that you can see the S:\NRES798EA directory, but don't do anything with the files for now.

To run STATA on the HPC

Once you have logged in to the HPC Windows terminal server, you need to start STATA from the Start Menu (note that we currently have only five simultaneous STATA logins available under our network license):

Stata Overview

Statistical packages differ in their use of graphical user interfaces (GUI) and command-line processing. SAS, for example, will allow you to use both, but they are not that well integrated. STATISTICA has moved away from a batch file language, per se, and allows you to develop Visual Basic macros, which can be quite useful for streamlining repeated processes but require a knowledge of Visual Basic to manipulate. Other packages just allow you to use a graphical interface (and perhaps keep a log of what you did), but that is not the same as keeping a script or batch file that you can reuse to run the same analysis.

STATA is one of those packages that offers you the best of both. You can access all of Stata's data management, statistical, and analysis features from the menus and associated dialogs, or directly from the command line. When a command is accessed via the menus, the corresponding text command appears in the Command Window and in the Review Window.

Please note, however, that even when developing an analysis from the graphical interface (which is a great way to learn the syntax of any command, particularly the graph commands), it is always best to create a do file (STATA's version of a batch file), so that you know exactly what you did when looking back at your results (or looking to do a similar analysis). The reason for using a text interface is first and foremost reproducibility of your analyses and results!

Stata is also an excellent tool for data manipulation: moving data from external sources into the program, cleaning it up, generating new variables, generating summary data sets, merging data sets and checking for merge errors, and reshaping data sets.

The vast majority of Stata commands are written in Stata's own programming language, the ado-file language. If a command is not built in to the kernel, Stata searches for it along the adopath. Like the PATH in Unix or DOS, the adopath indicates the several directories in which an ado-file might be located. This implies that Stata commands are not limited to those coded into the kernel. If Stata's developers tomorrow wrote a command named levels, they would make two files available on their web site: levels.ado (the ado-file code) and levels.hlp (the associated help file). Both are straight ASCII text.
The new files (or any files that you might want to use for an analysis) can be easily downloaded because STATA can access the web directly, so it can be used to search for commands both on the local machine and on the web (e.g., search tukey for the local machine, and net search tukey for web-based resources and add-ins).

NOTE, however, that if you are running a copy of STATA on your own machine (we do have access to a student package through STATA), you will have write access to all of the directories on your computer and can easily add new ado files to your copy of STATA. We do not have similar access on the HPC Windows terminal server, so if you need an add-in added to the STATA system, please email me the name of the ado file and I'll arrange to have it added to the HPC version as quickly as possible.
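As a rough illustration of the ado-file lookup and the two kinds of search described above, the commands below could be typed in the Command Window. This is only a sketch: the keyword tukey is the one used in the text, the output depends on what is installed on your machine, and in releases newer than the Stata 9/10 versions this handout targets, search may also report web results by default.

```stata
* Show the system directories and the order in which Stata
* looks through them for ado-files
sysdir
adopath

* Keyword search of the documentation and commands on the local machine
search tukey

* Keyword search of web-based, user-written resources (needs internet access)
net search tukey
```

On the HPC server the directories listed by adopath are not writable, which is why add-ins found with net search have to be requested rather than installed directly.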

The STATA Windows

[Figure: the main STATA window, with callouts for the variable list, where results are displayed, the previous commands, and where commands are entered]

Note that the window arrangement may vary (and you can change it). You can easily use the preference options to restore or save defaults (if you are having problems with the floating windows, then I suggest loading the Maximized Window Settings):

Basic Windows

The four default windows in STATA are:

1. The Results Window, where all of your commands and their results are displayed (with the exception of graphs, which are displayed in their own window). Anything displayed in blue can be clicked on to get help or other information.

2. The Review Window, where just your commands are displayed. You can click on any command in the window and it will be pasted to the Command Window. The Review Window has one extra option in its window icon menu: "Save Review Contents." This will allow you to save everything in the Review Window to a file for later use. This is not a substitute for using log and do files, however (see below)!

3. The Command Window, where you type your commands when working in interactive mode. Everything you type in here is echoed in the Results Window as well as the Review Window. The "Page Up" and "Page Down" keys can be used to scroll back and forth through commands you have executed previously. You can also copy and paste between this window and your "do" file.

4. The Variables Window, which contains a list of all your variables and their labels. You can click on a variable here and it will be pasted to the Command Window.

Additional Windows

There are three other windows in Stata:

1. Do-file editor: this is Stata's simple text editor for writing do-files, or programs. Eventually, you should do all of your work in a do-file so you can reproduce what you did later on (more on that below).

2. Viewer: used for displaying help and log files. Like the Results Window, anything displayed in blue can be clicked on for more information.

3. Graph: where all of your graphs will be displayed.

Remember that in any window, anything displayed in blue is a hyperlink to additional information (e.g., another explanation or related command), or a way of looking on the web for additional topics and program add-ins.

Getting Help in Stata

There are several ways of getting help in Stata. The first, of course, is the "help" command. You can enter the help command from the Command Window and the help text will be displayed in the Results Window. Or, you can use the "Stata command" option in the Help menu and the help text will be displayed in the Viewer window.

The "help" command is good only if you know the command for which you want help. If you do not know the command, then you should use the "search" command, which does a keyword search of the Stata documentation. The search command (or help menu option) has two versions: one that limits the search to your computer and one that searches Stata's website.

Often when you use search or help in Stata you will find commands that do exactly what you want, but when you try to use them on your own computer, you get an "unrecognized command" error. That is because many commands in Stata have been written by other Stata users and are not part of "official" Stata.
These are often referred to as "STB" (Stata Technical Bulletin) or "user-written" programs, and you must install these commands yourself. Stata makes this very easy, as you can just click your way through various menus in the Viewer.

Accessing Commands

When learning the package, you can select any feature from the Data, Graphics, or Statistics menu and fill in the resulting dialog. All features can be found in the menus, from generating a new variable to match-merges and reshaping datasets, from tabulations and summary statistics to negative binomial regression of a count outcome with survey data. When you close the dialog, note that the command-line equivalent appears in the Command Window.

Using Data

There are several ways of getting data into STATA, but the easiest is the "use" command. Because the paths on the HPC server can be complex, the easiest way to see the full syntax of the use command is to use the File > Open dialog on the main menu. By default this will look for a .dta file (which is STATA's data file format). The "use" command is only for data that are already in Stata format (as opposed to text, "raw" or Excel). You can also "use" data that are stored somewhere on the Internet by simply giving the URL and the filename!

All of the options for importing data are beyond this quick overview, but data from GIS programs and Excel can be imported as comma-delimited files, or in the case of Excel, you can paste a block of data (along with column headings) from the clipboard directly into the data editor (accessed by typing edit in the Command Window).

When you are finished with a session, you must save your data, and you do this with the "save" command. If you have made changes to your data, then you will either need to specify a new name for the file or include the "replace" option. You need to be quite careful when using commands (like compress) that change the data in memory, because you don't want to inadvertently overwrite your original data. The solution is to always keep backup copies of all of your files!

Do and Log Files

There are two important rules to remember with Stata:

1. Always do your work in a do-file!

2. Always have a log file running!

Having a do-file is the only way to make sure that you can reproduce your results at a later time! Do-files are a record of what you did and why (assuming you added plenty of comments). Log files are also a record of what you did, but they contain the results as well. Always keeping a log file (ideally one whose name matches that of your do file) gives you a permanent copy of your results in case you lose the printout that you might produce from the Results Window.
You can start a do-file by simply clicking on the do-file editor button. Make sure that the "Auto indent" and "Auto save on do/run" options are checked in the Edit > Preferences menu. You can enter any valid Stata command in the editor and click on "Run" to run all of the commands. You can even highlight one or more commands to run only those commands. You can also create STATA do files with any text editor (TextPad is a particularly powerful one) and then either run the do file from STATA or paste chunks of commands from the clipboard into the Command Window as you are developing and checking your do files.

To start a log file, type log using mylog.log (or whatever you want to call the log file) in the Command Window. When you use ".log" as the extension, Stata automatically creates the log as a plain text file that can then be opened in TextPad, MS Word or Notepad as well as Stata's Viewer. Everything you see in the Results Window goes into the log, with the exception of output from the "help" command and graphs created with the "graph" command. When you are done, you can close the log with the "log close" command.
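The use, save, and log commands described above might be combined as in the following sketch. It assumes the auto example dataset that ships with Stata as a stand-in for your own .dta file, and the file names mylog.log and auto_copy.dta are hypothetical; both files are written to the current working directory.

```stata
* Close any log left open by an earlier run, then start a plain-text log
capture log close
log using mylog.log, replace text

* Load an example dataset shipped with Stata (stand-in for your own file)
sysuse auto, clear
describe

* Save under a NEW name so the original file is never overwritten;
* the replace option only overwrites this copy if it already exists
save auto_copy.dta, replace

log close
```

Opening mylog.log in any text editor afterwards should show the describe output along with the commands that produced it.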

Types of Variables

String variable: a string variable is one that has letters and/or numbers as opposed to just numbers. An example would be a person's name. Numbers can be treated as strings, but strings cannot be treated as numbers. Strings are also referred to as "character" or "alphanumeric" variables.

Numeric variable: These are numbers, plain and simple. Decimals, commas, and minus signs are the only acceptable non-number characters allowed.

Variable label: A variable label is a short description of a variable. These appear in the Variables Window in Stata and also in the output. They are not required, but they make the output easier to understand.

Value label: Like variable labels, value labels make the output easier to read. Instead of printing 1, 2, 3, Stata will print Yes, No, Maybe.

Creating and Manipulating Variables

In Stata, there are three basic kinds of variables: numeric, string and date. Although dates are technically stored as numeric data, their use is different from regular numeric data.

Numeric variables can be integers, decimals, negative and positive. In the output from the describe command, numeric variables can show up with several different "Storage Types." The specific meaning of these is nothing you need to be concerned with right now; just know that anything other than "str" is a number. Many analysis commands like "reg" and "sum" will work only on numeric variables. If you receive a "type mismatch" error, then you are probably trying to do an analysis on a string variable.

String variables, often referred to as "character" or "alphanumeric" variables, are variables whose values may have letters or other special characters in them. It is possible to store numbers as though they were letters (a common source of the "type mismatch" error).

Date variables are a special case of numeric variables. Although they are often entered as strings (e.g., 01JAN1992 or 01/01/92), they must be stored in Stata as numbers to make them useful.
Stata has several commands for working with dates and time-dependent data. Briefly, Stata stores all dates as the number of days (or months or quarters, etc.) from January 1, 1960. Dates before then are negative, and dates after are positive. SAS uses the same date as its origin; SPSS counts from October 14, 1582, and Excel uses January 1, 1900 by default (it can use other dates, so you need to make sure). If you are importing dates from an Excel spreadsheet, you must re-format the date columns and import them as strings.

Rules for Variables

Variable names can have up to 32 characters (8 or fewer is better, though) and must begin with a letter. Variable names are case-sensitive. Use descriptive names: var1 means nothing. The question number from a survey is a good choice. When listing variables in commands, you can use "?" and "*" to represent any single character or any number of characters, respectively. Values for string variables can be up to 80 characters in the Intercooled Edition. Anything over this limit will be dropped. Values for string variables are enclosed in double quotes.
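Returning to the date handling described above, the following sketch shows one way dates imported as strings might be converted to Stata's numeric dates. The variable names rawdate and visit are hypothetical, and the "DMY" mask assumes Stata 10 or later (earlier releases used a lowercase mask).

```stata
* Three dates entered as strings, as they might arrive from Excel
clear
input str9 rawdate
"01JAN1960"
"01JAN1992"
"31DEC1959"
end

* Convert to Stata dates: the number of days since 01jan1960
gen visit = date(rawdate, "DMY")

* Look at the raw numbers first, then apply a display format
list rawdate visit
format visit %td
list rawdate visit
```

The first listing shows the stored day counts (zero for the origin, a negative value for the day before it); the second shows the same values displayed as dates once the %td format is applied.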

Missing values for numeric variables are represented with a dot: "." Missing values for string variables are represented by two double quotes with nothing in between: "". This is not the same as a blank!

A Sample Session

We will return to more on commands later, but we'll begin by just working with the interface, some files and simple commands.

Always begin by starting a log file. You can do this either by typing log using test.log, text replace in the Command Window or by using the File > Log > Begin command:

When you select the name of the log file, be sure to use the following dialog to select a log in text format (which can be read by any text editor) instead of the smcl format (make sure that the ending is .log as well):

From the File Menu, select Open and then navigate to the S: drive, open the NRES712 folder and then double click on the file sample_moose.dta. You will get an error message that you don't have enough memory to load the file, but don't worry about that for now. Notice instead that the command:

    use "S:\NRES712\sample_moose.dta", clear

has appeared in the Review Window.

Before we can load this large file, we need to first clear the portion that was loaded from memory and then increase the memory allocation for STATA. We will do both in the Command Window.

In the Command Window, type

    clear

and hit return. The command clear appears in the Review Window and all data have been cleared from memory. Before we can try to load the file again, we will increase the memory allocation by using the set command. In the Command Window, type

    set memory 600M

This has changed the amount of memory available for the data file. Instead of using the File menu again, click on the use "S:\NRES712\sample_moose.dta", clear command in the Review Window. If you single click on a command in the Review Window, it will appear in the Command Window (this way you can alter a command slightly before executing it). Just like a command that you typed in, you execute the command by hitting Enter. Alternatively, if you double click on a command in the Review Window, it will be executed immediately.

In the Command Window, type

    describe

A full description of all of the variables (and their labels) appears in the Results Window. To see a summary of a variable or variables, you can use the sum command.

In the Command Window, type sum [abbreviation for summarize]. Instead of typing the name of a variable, scroll through the Variables Window and single-click on the variable elevation; the name of the variable appears in your Command Window. Now click on the name aspect in the Variables Window. The Command Window now contains the command sum elevation aspect. Hit return and you get the results of this command in the Results Window, and the command is added to the Review Window.

Suppose you wanted to have confidence intervals and measures of variation for both of these variables: ci would be the command to use.
You could type in ci elevation aspect in the Command Window, but instead (assuming you are still in the Command Window), press the Page-Up key. This scrolls back one command; then replace sum with ci and hit return. If the command that you wanted to reuse was far up in the command list, you could single click on the command in the Review Window.

A very powerful feature of STATA is the ability to easily work with categorical variables to partition analyses. For example, suppose that you wanted the means and standard errors for elevation in each season. To see what the seasons are, type tab season [tab for tabulate] in the Command Window, and a list of the five seasons we used in our analyses of these moose data appears in the Results Window.

To get the standard errors for elevation in each of the five seasons you would use the command:

    by season, sort: ci elevation

Note that the by construct only works if all of the cases in the data set are sorted by the variable you are using for grouping, in this case season; the sort option takes care of that. You could obtain the same results with the command bysort season: ci elevation.
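The grouping commands above can be tried on a small made-up dataset, as in the sketch below. The six observations are invented purely for illustration and are not taken from the moose file. The version statement is there because Stata 14 and later expect ci means elevation; declaring version 9.2 asks a newer Stata to accept the older syntax used in this handout.

```stata
* Ask newer releases of Stata to accept the older "ci varlist" syntax
version 9.2

* Invented example data: two seasons, three elevations each
clear
input str10 season elevation
"Calving" 810
"Calving" 790
"Calving" 830
"Rut"     1020
"Rut"     980
"Rut"     1000
end

* One row per season with its frequency
tab season

* Summary statistics, then means, standard errors and 95% confidence
* intervals, computed separately within each season
bysort season: sum elevation
bysort season: ci elevation
```

Because bysort sorts the data before running the command, the observations do not need to be in season order beforehand.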

In this dataset, the variable pttype indicates whether the data were observed points (GPS locations, in which case the value of pttype is 1) or random points (pttype is 0). We can use the tabulate (tab) command to quickly summarize the data (and look for any data entry errors). In the Command Window type

    tab pttype season

This will produce a table of the number of used and random points in each season. We could have also obtained this information with the count command, which we'll now use to illustrate conditions (and to convince you how handy the tab command is).

When using conditions, you need to use the value of the variable, not its label. To illustrate this, again type tab season (or page-up to get it, or click on the command in the Review Window). When you execute this command you see the following in the Results Window:

    . tab season

    Season        Freq.   Percent   Cum.
    Calving       7,
    LateWinter    18,
    Rut           17,
    Summer        14,
    Winter        29,
    Total         87,

We will look more at labels later, but for now the order of the labels tells you their value: Calving is Season 1, LateWinter is 2, etc.

We can see the levels of pttype by typing tab pttype.

    . tab pttype

    Pttype        Freq.   Percent   Cum.
                  ,
                  ,
    Total         87,

To see how many cases were known points (pttype of 1) in calving (season is 1) you could type

    count if pttype==1 & season==1

Note that for conditions in STATA you use == not =, and the & indicates a second condition; both need to be true for a case to be counted. Check that this number corresponds to the results of the tab pttype season command you used before by scrolling up in the Results Window.

To use STATA's data editor you can either click on the toolbar icon, or type edit in the Command Window. Note that STATA also has a data browser (essentially the same as the editor except that data cannot be changed), and this can be invoked with the toolbar icon or by typing browse in the Command Window.
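To see why a condition must use the stored value rather than the label, the idea can be tried on a few invented observations. In this sketch the five rows and the label name seasonlbl are hypothetical; only the variable names pttype and season come from the handout.

```stata
* Invented example data: point type (1 = observed, 0 = random) and season code
clear
input pttype season
1 1
1 1
0 1
1 2
0 2
end

* Attach labels to the numeric codes; labels only change what is displayed
label define seasonlbl 1 "Calving" 2 "LateWinter"
label values season seasonlbl

* The table shows the labels...
tab pttype season

* ...but the condition refers to the stored value 1, not to "Calving"
count if pttype==1 & season==1
```

The count should match the Calving column of the pttype==1 row in the table produced by tab.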

Invoke the STATA editor by typing edit in the Command Window.

You can see that the editor is like a spreadsheet, with columns corresponding to variables and rows to observations. You can also copy and paste within STATA or between STATA and a spreadsheet such as Excel (note, however, that STATA allows much larger data sets than those allowed in Excel). A very quick way to import data into STATA from Excel is to highlight a block of data in Excel (beginning with the variable names in the first row) and then paste that block directly into the STATA data editor; the first row will be interpreted by STATA as the variable names.

To exit the data editor you can either click on the window icon and then select Close from the drop-down menu, or else click on the close (X) button in the top right corner of the data editor.

Examining the Log File

Now let's look at the log file that you have created. Begin by closing the log file. The simplest way to do this is by typing log close in the Command Window. Now, using Windows Explorer (Start > My Computer), navigate to c:\data and then double click on the file that you have been logging to. You should see all of the information that has been displayed in the Results Window.

Creating and Running Do Files

Next we will look at an already written do file in STATA's text editor and then run the file. You can invoke the editor either by typing doedit in the Command Window or by clicking on the Do-file Editor icon on the toolbar. The Do Editor Window should then open. Using the editor, load the file test_moose.do from S:\NRES712

The contents of test_moose.do should look like the following:

    /* This do file was generated by an excel macro Stata Version
    */
    /* Begin by increasing memory */
    set memory 600M
    capture log close
    set more off
    log using moose.log, replace text
    use "S:\NRES712\sample_moose.dta", clear
    * 598 during Calving ElevxAspectxHabitat
    desmat: habitat=dev(6) /*
    */ if animal== 598 & season== 1 & habitat~=1 & habitat~=2 & habitat~=3 & habitat~=4 & habitat~=8 & habitat~=9 & habitat~=10 & aspect~=5, robust coef desrep(all notrunc)
    mlfit
    lroc, nograph
    log close

Close the Do Editor from the File Menu or by clicking on the close (X) button in the top right corner of the editor.

To run a do file, select Do from the File Menu, and then navigate to the S:\nres712 directory and open (or double-click on) test_moose.do.

As you will see from the Review Window once you have closed this dialog, you could also have typed do "s:\nres712\test_moose.do" (note that double quotes are needed around a path that contains spaces, so it is safest to always use them when you specify a path along with a filename).

The results of the logistic regression that you have just run are displayed in the Results Window. Confirm that you have a log of the output by looking at the log file, which you specified should be written to c:\data\moose.log (note that you would normally want to log to a directory that you have permanent access to, such as on your H: drive).

Examining the Output

Open Windows Explorer (Start > My Computer) and then navigate to c:\data and open the file moose.log:

What you should see is a complete log including your source data file, the exact commands that you gave to STATA, the output from the logistic regression (more on this in another lab, but we are using DESMAT to help deal with the categorical variables, such as Habitat, that are being used as independent variables), and various estimates and measures of fit for the regression. Note that in this regression, only Elevation, Elevation^2, and the intercept (_cons) were significant.

Comment your Do Files

Below is the default outline that I use for a do file (note the frequent use of comments):

    /* include version number */
    version 9.2
    * if a log file is open, close it
    capture log close
    * don't pause when output scrolls off the page
    set more off
    * log results to file myfile.log
    log using myfile, replace text
    /* do file commands start here */
    put your commands here
    *
    * close the log file
    log close

/* and */ delimit a block comment that can wrap across lines. If a line starts with a *, the rest of that line is ignored. myfile.log will be a text file, and a path can be specified.

Comments are printed directly through to the log file, so make sure to include comments with your commands; they will help you interpret your output!

Importing Data into STATA

STATA can import (and export) a variety of data formats. The formats can all be accessed through the command line, but I find it easiest to use the menu system to get the syntax correct. For example, you can import a comma-delimited file created in Excel (and saved as a csv file) or by a GIS program such as ArcView or ArcMap by selecting ASCII data created by a spreadsheet, as in the example below:

This will bring up the following dialog box, which you can then use to navigate to your specific data and to specify the file format, etc. If the first row contains the names of the variables (a good idea), then they will be automatically converted to variable names in STATA. If you check the Replace data in memory checkbox, the clear option is automatically added to the command and all data in memory will be cleared without being saved!

Using the Clipboard to Import from Excel

Another way to get some data sets into STATA is to use the clipboard to transfer directly from Excel to STATA (note that STATA is capable of handling much larger data sets than Excel, so reversing this process, i.e., copying STATA data sets to Excel, may not be possible for large data sets such as the moose file you were using earlier in this lab).

Using the Start Menu > All Programs > Microsoft Office > Excel, launch Excel. Now open the file called painted.xls, which is in the S:\NRES798EA directory. These data are some morphometric measurements of painted turtles.

Using the following illustration as a guide, highlight the entire block of text, making sure to include the first row that contains the variable names (do not select the entire columns, just the illustrated block of text).

With the text highlighted, either right click the mouse and select copy, or copy the text to the clipboard using the menu system. Now go back to STATA.

It would not make sense to add these data to the data currently in memory, so we need to begin by clearing the current data. To do this type (assuming of course that you don't want to save any changes before doing so):

    clear

To paste the data that are currently in the clipboard into the STATA Editor, we need to open the editor, so type:

    edit

The cursor should be in the 1st row and 1st column, so just press shift-insert, which is a keyboard shortcut for paste.

The data that were in the first row of the Excel file (the variable names) have now been used as the names of the variables (although they are now all in lower case). Having lower case variable names is a good practice, because STATA is case-sensitive, and by not using upper-case letters in naming variables, you make it easier to specify them when using them in analyses.

Exit the Data Editor and return to STATA.
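For the comma-delimited route mentioned under Importing Data into STATA, the command that the menu builds is insheet in the Stata 9/10 releases this handout targets (later releases also offer import delimited). The sketch below is self-contained only because it first writes its own small csv file: the file name turtles_demo.csv and the three rows of measurements are invented for illustration and have nothing to do with painted.xls.

```stata
* Build a tiny comma-delimited file to stand in for one exported
* from Excel or a GIS program (written to the working directory)
clear
input obs weight length
1 320 118
2 295 112
3 410 131
end
outsheet using turtles_demo.csv, comma replace

* Import it: names takes the variable names from the first row, and
* clear discards whatever is in memory WITHOUT saving it
insheet using turtles_demo.csv, comma names clear
describe
list
```

As with the clipboard method, it is worth running describe straight after the import to confirm that numeric columns did not come in as strings.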

Whenever you enter new data, you want to make sure that they have been interpreted the way you intend them to be. With a real analysis you would also be looking for outliers, checking sample sizes, and looking at the data. Use the following commands to check these data.

    describe

Describes all of the variables in the current dataset.

    tab sex

Tabulates the number of observations by the variable sex; because there are turtles of unknown sex, you will see the number of observations in the missing sex category.

    by sex, sort: sum

This command will give you a summary of all of the continuous variables (because you didn't specify particular variables), with data for males, females, and unknowns treated separately.

    edit

Normally you would not edit the file, but what I want you to notice here is that the data are no longer in the original order; this is because you just used a sort command. There are several reasons, including adding more variables for each case, why you may want to get the data back into their original order. This is possible with this data set, because you have a variable, obs, that specifies the original data order. Exit edit and we'll see how to create a variable that would have saved the original order.

    sort obs

We are just doing this to return your dataset to its original order.

    gen order=_n

This is your first use of the generate command, which is used to create new variables. In this case you are creating a new variable called order, which is being set to _n. _n is a special STATA variable that stores the case number, so your new variable contains the current order of the cases in your dataset (the original order), allowing you to return to this order at any time by typing sort order.

    edit

If you return to the Data Editor, you will see that the variables obs and order have exactly the same values; exit the editor.

    replace sex="unknown" if sex=="."

This command is going to replace all of those instances of a missing sex (i.e., .) with the value of UNKNOWN. If you just typed replace sex="unknown", then all values of sex would be changed, but with the if condition you are only going to make the change if sex has the value of . (remember that in a condition, STATA uses == for =).

    tab sex

You can now see that there are now three sexes.

    by sex, sort: ci

This command gives you the mean, SE, and 95% confidence interval for the data grouped by the 3 levels of sex.

    scatter weight length

Produces a scatter plot of weight (Y) versus length (X).

    scatter weight length if sex=="MALE"

Produces a scatter plot of weight (Y) versus length (X) for only those data in which sex is MALE.
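The effect of the if condition on replace can be checked on a few invented rows before trying it on the turtle data. In this sketch the four observations are hypothetical, and the replacement value is written in capitals only to match the MALE and FEMALE categories used here.

```stata
* Invented example data: sex stored as a string, with "." marking unknowns
clear
input str7 sex weight
"MALE"   310
"FEMALE" 420
"."      150
"MALE"   305
end

tab sex

* Only rows where sex is exactly "." are changed; note == in the condition
replace sex = "UNKNOWN" if sex == "."

tab sex
```

Comparing the two tables shows that the "." category has been renamed while the MALE and FEMALE counts are untouched. Running the replace a second time changes nothing, because no "." values remain.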

STATA can produce a wide range of graphs (improved even further in STATA 10), and these are most easily produced by using the Graphics menu (at least to learn the specific syntax). Normally you would do the graphing interactively rather than including graph commands in a do file and trying to output them to a text log file.

Using Existing Commands to Make a DO File

By this time, the Review Window contains many commands. Suppose that you now wanted to turn several commands that you have experimented with into a DO file (as in the assignment that follows!). If you right-click the mouse over the contents of the Review Window, you can either Save the Review Contents or Copy the Contents to the Clipboard:

Copy the contents to the clipboard and then open the Do Editor by typing doedit in the Command Window (you could also paste them into Notepad or into Word, but if you do the latter make sure to save the document as a .txt file and not a formatted .doc file). You can now use these commands as the basis for constructing a do file (after adding logging, opening of data files, etc.).

Exercise: Using the data in painted.xls, create a STATA data file and then save the painted.dta file just as it comes in from Excel via pasting to the Data Editor. Then write a do file that begins by starting a log file, renames cases where sex is missing to UNKNOWN, describes the data, summarizes the data by the categories of sex, and provides confidence intervals for all continuous variables based on the categories of sex. The do file should end by closing the log file. Make sure that you do not save the .dta file again, because once you have replaced the unknown sex cases, the replace statement you are going to use in your do file won't work!

Please email me a copy of your do file and the completed log file that results when you run the do file!

Search and Net Search

    search t test
    net search desmat
    help ci

Labeling Variables
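The variable labels and value labels described under Types of Variables are attached with the label command. The sketch below uses three invented observations; the variable name answer, the label name answerlbl, and the wording of the variable label are all hypothetical.

```stata
* Invented example data: a survey answer stored as a numeric code
clear
input id answer
1 1
2 3
3 2
end

* Variable label: a short description shown by describe and in output
label variable answer "Would you return to this site?"

* Value labels: first define the mapping, then attach it to the variable
label define answerlbl 1 "Yes" 2 "No" 3 "Maybe"
label values answer answerlbl

describe
tab answer
```

After this, tab prints Yes, No and Maybe instead of 1, 2 and 3, but conditions such as if answer==1 still refer to the stored numbers.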

String Variables

We'll begin with string variables since they are the easiest to work with. As in any package, their values are case-sensitive.

    gen firstname="paul"
    gen initial="abcdefghij"

To create a string variable, use the gen command. Enclose the values themselves in double quotes.

    replace firstname="bob" if employed==4
    replace firstname="sue" if mugged!= 3
    replace firstname="none" if firstname==""

Not everyone's first name is Paul, so we will need to change the values for some observations. The "if" clause allows us to do this. In the first example, only those observations whose value for the variable "employed" is 4 will be changed; all other observations are left alone. Note the double "=". The second example will change all observations whose values for mugged are NOT 3. Finally, we can even change values of a variable based on itself - in this case, we change all the missing names to "none." Note the double equal sign in the if clause!

Numeric Variables

Creating and manipulating numeric variables is just as easy as string variables.

    gen numvar1=1
    gen numvar2=numvar1+income
    gen numvar3=(numvar1/income)*100

Just like with string variables, you can create new numeric variables with the gen command. Any valid mathematical expression is allowed.

    replace numvar1=5 if mugged==3
    replace numvar2=income/rand if numvar3>.05

Replacing values in numeric variables works much the same way as for string variables. We can use "if" clauses in replacing numeric values as well.

    replace numvar2=income/rand if numvar3>.05 & numvar3!=.

One caveat that often comes up is how Stata treats missing values. Since missing values are treated as larger than any number, the expression "numvar3>.05" will include missing values. This may not be what you really want, so you must include the "& numvar3!=." to exclude any missing values.

    replace numvar2=. if citynum==2 | citynum==5 | citynum==7
    replace numvar2=. if inlist(citynum,2,5,7)

The first command will make numvar2 missing if citynum is equal to 2, 5 or 7. A very useful function is inlist, which allows you to simply list the values you want to match.

    recode mugged 1=2
    recode mugged 1=2 3=4
    recode mugged 1=2 *=5
    recode mugged =5
    recode mugged =5, gen(mugged2)

The recode command can be an easy way of changing the values of a numeric variable (recode only works with numeric variables). All you need to do is provide a list of the values you want to change. The "*" means all other values not explicitly listed - including missing! Finally, the "gen()" option tells Stata to create a new variable that will be the recoded version of the original. This is highly recommended so that you do not destroy your original variable! This is one way of collapsing values.

Dummy variables are numeric variables whose values are 0 and 1. There are two basic ways of creating dummy variables: one for when you are creating dummies from a continuous variable, and one for a categorical variable.

    gen income_dummy=.
    replace income_dummy=1 if income>=6000 & income!=.
    replace income_dummy=0 if income<6000

These three commands build a dummy from a continuous variable; the "& income!=." keeps observations with missing income from being coded as 1.

    tab mugged, gen(mugged_dummy)

This command builds one dummy for each category of a categorical variable.
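The missing-value caveat and the two ways of building dummies can be tried on Stata's shipped auto dataset, where rep78 has some missing values. This is only a sketch: the dataset, the cutoff of 6000 for price, and the new variable names (high_price, rep78_d, rep78_3) are assumptions chosen for illustration and do not come from the handout's data.

```stata
* Illustration on the auto dataset that ships with Stata
sysuse auto, clear

* Missing values count as larger than any number,
* so the first count includes cars whose rep78 is missing
count if rep78 > 3
count if rep78 > 3 & rep78 != .

* Dummy from a continuous variable, guarding against missing values
gen byte high_price = .
replace high_price = 1 if price >= 6000 & price != .
replace high_price = 0 if price < 6000

* One dummy per category of a categorical variable
tab rep78, gen(rep78_d)

* Collapse categories into a new variable, leaving the original intact
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), gen(rep78_3)
tab rep78 rep78_3, missing
```

Comparing the two count results shows how many observations the unguarded condition would have picked up by accident.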

Extended Generate (egen)

Egen is one of Stata's most powerful and useful commands. Like generate, it is used to create new variables, but it does much more than that. Egen can create variables that would be difficult and tedious to create on your own. Some examples are variables whose values are the mean of another variable for each group, such as income for males and females. Egen can also create variables that count the number of observations that meet a certain criterion, or that simply number observations. The only way to truly see how powerful egen can be is to show a few examples and then have you explore the other available functions on your own.

    egen age_cat = cut(age), at(10,15,20,25,30,35)
    egen age_cat = cut(age), group(6)

"cut" is very useful for collapsing variables. You can either specify the lowest value for each new group with the "at()" option, or simply specify the number of groups you want with "group()". With the "at()" list shown here, any observations with a value less than 10 will be given a missing value for age_cat, as will any observations with a value of 35 or more, because the last number in "at()" only closes the final interval.

    egen age_mean = mean(age), by(year)

This creates a variable that is the mean of age for each year. In addition to mean, there are min, max, sd, and several other statistics.

    egen numobs = count(personid), by(personid year)

"Count" simply counts the number of nonmissing observations within each combination of personid and year. This can be used to make sure that you have the same number of observations for each respondent in each year.

    egen city_yr = group(cityname year)
    egen city_yr = group(cityname year), label

"Group" numbers the groups formed by crossing cityname and year. The groups are numbered consecutively, which makes this a good variable to use in analysis. The "label" option causes Stata to use the value labels (if any) of cityname and year in creating city_yr.

    egen comp_id=concat(householdid familyid personid),decode p(/)

The "concat" function is very useful when you have two or more variables that you want to combine to form one variable but adding or multiplying them would not make sense. The "decode" option works like the "decode" command in that it uses the value labels to create the new variable. The "p()" option allows you to put a separator character between the values.
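A short session on Stata's shipped auto dataset shows the same egen functions in action. It is a sketch under assumptions: the dataset, the break points, and every new variable name are chosen for illustration and are not part of the handout's data.

```stata
* Illustration on the auto dataset that ships with Stata
sysuse auto, clear

* Categories coded by the lower bound of each interval;
* prices below 3000 or at/above 16000 would become missing
egen price_cat = cut(price), at(3000, 5000, 7000, 10000, 16000)

* Alternatively, four groups of roughly equal size, coded 0 to 3
egen price_grp = cut(price), group(4)

* Group mean and group count attached to every observation
egen mean_price = mean(price), by(foreign)
egen n_cars = count(price), by(foreign rep78)

* Consecutive group numbers, labelled from the value labels
egen for_rep = group(foreign rep78), label

* String identifier built from two variables with a separator
egen for_rep_str = concat(foreign rep78), decode punct(/)

list make price price_cat price_grp for_rep for_rep_str in 1/5
```

Tabulating price_cat against price_grp is a quick way to see how the two ways of cutting differ.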

Converting Between String and Numeric Variables

Before we get into date variables, it will be useful to learn how to convert string variables into numeric and vice-versa. Sometimes, for various reasons, a number will get read into Stata as a string. We must convert it before we can do any analyses on it. There may even be times when we want to treat a numeric variable as a string (such as Social Security Numbers or other ID variables), although not as often. There are four ways to make these different conversions: the "destring," "decode," and "encode" commands, and the "real" and "string" functions used with the gen command.

    destring d_income, gen(inc_pct_num) ignore("$")
    destring inc_pct, gen(inc_pct_num) percent
    destring inc_pct, gen(inc_pct_num) percent force

The "destring" command will convert a string variable into a numeric variable. It is used particularly when you have data that include special characters such as dollar or percent signs. The general form of the command is to specify the string variable, a new numeric variable to generate, and the character or characters you want to remove. If you have a percent variable with a percent sign, you can use the "percent" option. This has the same effect as specifying ignore("%") and then dividing the result by 100. Using the "force" option tells Stata that if it can't make a proper conversion, then the new variable should have a missing value.

    gen numvar = real(str_num)

The "real" function simply tells Stata to convert all numbers in str_num into numeric data. Anything that is not a number will be made missing. Use the real function only when you do not have special characters.

    encode habitat, gen(habitatnum)

Sometimes you have a legitimate string variable such as habitat types or forest cover. To use this variable in a statistical analysis, such as a factor in an ANOVA, it must be numeric. The "encode" command will accomplish this. A nice feature of this is that the character values will be used to automatically create value labels for the new numeric variable.

    decode citynum2, gen(cityname)
    gen city_str2 = string(city_num)

To convert a number into a string, you can use the "decode" command. One caveat to the decode command is that the numeric variable must have value labels assigned. If you have too many values to bother making labels for, you can still make the numeric to string conversion using the "string" function with the generate command.
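The conversions above can be checked on a tiny hand-made dataset. Everything in this sketch is an assumption made for illustration: the three rows of data are invented, the variable names are hypothetical, and a thousands comma stands in for the dollar sign as the character to ignore.

```stata
* Made-up data for illustration only
clear
input str8 d_income str4 inc_pct str7 habitat
"1,200" "12%" "forest"
"850"   "7%"  "meadow"
"n/a"   "15%" "forest"
end

* Strip the comma; "n/a" cannot be converted, so force makes it missing
destring d_income, gen(income_num) ignore(",") force

* percent removes the % sign and divides by 100 (12% becomes .12)
destring inc_pct, gen(pct_num) percent

* String categories to a labelled numeric variable, and back again
encode habitat, gen(habitat_num)
decode habitat_num, gen(habitat_str)

* Function-based conversions used with gen
gen income_str = string(income_num)
gen income_back = real(income_str)

list
describe
```

Running describe at the end shows which variables are stored as strings and which as numbers, and that habitat_num carries a value label.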

By Groups

You can execute almost any command on each level of a variable by prefacing it with "by." Unfortunately, "by:" relies on the ordering produced by the "sort" command, not "gsort".

    by citynum: sum income
    by citynum year: sum income
    by citynum, sort: sum income

The "by citynum:" command prefix simply tells Stata to execute the "sum income" command on each citynum separately. You can list more than one variable in the "by...:" prefix. Stata assumes the data are already sorted in this order and will report an error if they are not. You can use the "sort" option to tell Stata to sort the data if they are not already sorted.

    by citynum, rc0: reg income year
    by citynum (year): gen income_lag=income[_n-1]

Finally, the "rc0" option tells Stata not to stop if it encounters an error along the way. Some statistical analyses require a minimum number of observations; if one or more of your groups does not have enough observations, Stata will stop executing the command unless you specify this option. By enclosing a variable in parentheses you ensure that the data are in the correct order, but the command is executed only on the groups of citynum.

Observation Indexing

Observation indexing is one of Stata's coolest features. It is also one of its more esoteric features. As mentioned before, Stata numbers the observations in your dataset internally from 1 to N in the current sort order. This is an actual variable which you can use like any other variable; it is called "_n". It is not saved with your data and you won't see it in your variable list, but you can create your own variable using it.

Collapse and Contract

The collapse and contract commands are used to create a new dataset from the one in memory by summarizing the data. In other words, collapse and contract will create a dataset of means, medians, and frequencies. Be aware that the newly created dataset will replace the one in memory, so be sure to save any changes you have made before executing the command.
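The by prefix, "_n" indexing, and collapse described above can be combined in one short session. The sketch below uses Stata's shipped auto dataset; the dataset and the new variable names (recno, price_prev, max_price) are assumptions for illustration. Wrapping collapse in preserve and restore is one way, not mentioned in the handout, to get the original data back without saving and reloading a file.

```stata
* Illustration on the auto dataset that ships with Stata
sysuse auto, clear

* Keep a record number so the original order can be restored later
gen long recno = _n

* Run a command separately for each group, sorting first
by foreign, sort: summarize price

* Within each group, ordered by price, pick up the previous car's price
bysort foreign (price): gen price_prev = price[_n-1]

* collapse replaces the data in memory, so keep a temporary copy
preserve
collapse (mean) price mpg (max) max_price=price, by(foreign)
list
restore

* Back to the full dataset in its original order
sort recno
```

The first observation in each group has a missing price_prev, because there is no observation before it within that group.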

    collapse age income

Without specifying any options or statistics, Stata will produce a one-observation dataset of means.

    collapse age income, by(year)

Most often, you will use the by option to produce statistics for each group.

    collapse (mean) age income (max) rand, by(year)

You can have several different types of statistics in your output dataset. By default, the variables in the new dataset will have the same names as those in your original dataset.

    collapse (mean) age income (max) max_age=age, by(year)

If you want more than one statistic for a particular variable, then you must supply a new name for it: new name = original name.

    collapse (mean) age income, by(year) cw

By default, Stata will use all possible observations when calculating the statistics. This may result in the mean for age being based on a different number of observations than the mean for income. The cw option performs a case-wise deletion, meaning that any observation that has a missing value for any variable specified will be dropped.

    contract employed mugged cityname
    contract employed mugged cityname, zero
    contract employed mugged cityname, nomiss

Contract will create a dataset containing the frequencies of all the combinations of the listed variables. The zero option tells Stata to include any combinations with zero frequencies. The nomiss option tells Stata to not use any observation that has a missing value for any of the listed variables.

General Considerations

It is always a good idea to have an initial record number, both for re-sorting the data back to their original order and for merging in new data.

Some Useful Commands to Include in Do Files (or in your profile if running STATA on your own machine)

    /* this file is run by stata at startup */

    /* begin by setting memory */
    set memory 600M
    /* maximize space set aside for matrix operations */
    set matsize 800
    /* make default for all logs txt instead of SMCL */
    set logtype text
    /* set default to have the scroll (more) off */
    set more off
    /* set scrollbuffer size */
    set scrollbufsize

Assignment

Some good resources on the web


Lecture 2: Advanced data manipulation Introduction to Stata- A. Chevalier Content of Lecture 2: Lecture 2: Advanced data manipulation -creating data -using dates -merging and appending datasets - wide and long -collapse 1 A] Creating data

More information

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller Table of Contents Introduction!... 1 Part 1: Entering Data!... 2 1.a: Typing!... 2 1.b: Editing

More information

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2

More information

A quick introduction to STATA:

A quick introduction to STATA: 1 HG Revised September 2011 A quick introduction to STATA: (by E. Bernhardsen, with additions by H. Goldstein) 1. How to access STATA from the pc s at the computer lab and elsewhere at UiO. At the computer

More information

A Short Guide to Stata 10 for Windows

A Short Guide to Stata 10 for Windows A Short Guide to Stata 10 for Windows 1. Introduction 2 2. The Stata Environment 2 3. Where to get help 2 4. Opening and Saving Data 3 5. Importing Data 4 6. Data Manipulation 5 7. Descriptive Statistics

More information

DataExpress Frequently Asked Questions

DataExpress Frequently Asked Questions DataExpress Frequently Asked Questions Frequently Asked Questions (FAQs) are a problem-solving resource that you can use to answer your questions regarding database structure, DataExpress, and other reporting

More information

Oracle SQL. murach s. and PL/SQL TRAINING & REFERENCE. (Chapter 2)

Oracle SQL. murach s. and PL/SQL TRAINING & REFERENCE. (Chapter 2) TRAINING & REFERENCE murach s Oracle SQL and PL/SQL (Chapter 2) works with all versions through 11g Thanks for reviewing this chapter from Murach s Oracle SQL and PL/SQL. To see the expanded table of contents

More information

Adobe Acrobat 8 Professional Forms

Adobe Acrobat 8 Professional Forms Adobe Acrobat 8 Professional Forms Email: training@health.ufl.edu Web Site: http://training.health.ufl.edu 352 273 5051 This page intentionally left blank. 2 Table of Contents Forms... 2 Creating forms...

More information

Welcome to Introduction to Microsoft Excel 2010

Welcome to Introduction to Microsoft Excel 2010 Welcome to Introduction to Microsoft Excel 2010 2 Introduction to Excel 2010 What is Microsoft Office Excel 2010? Microsoft Office Excel is a powerful and easy-to-use spreadsheet application. If you are

More information

Using Microsoft Excel

Using Microsoft Excel About Excel Using Microsoft Excel What is a Spreadsheet? Microsoft Excel is a program that s used for creating spreadsheets. So what is a spreadsheet? Before personal computers were common, spreadsheet

More information

Part 2 Uploading and Working with WebCT's File Manager and Student Management INDEX

Part 2 Uploading and Working with WebCT's File Manager and Student Management INDEX Part 2 Uploading and Working with WebCT's File Manager and Student Management INDEX Uploading to and working with WebCT's File Manager... Page - 1 uploading files... Page - 3 My-Files... Page - 4 Unzipping

More information

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming Intro to Programming Unit 7 Intro to Programming 1 What is Programming? 1. Programming Languages 2. Markup vs. Programming 1. Introduction 2. Print Statement 3. Strings 4. Types and Values 5. Math Externals

More information

Getting Started With. A Step-by-Step Guide to Using WorldAPP Analytics to Analyze Survey Data, Create Charts, & Share Results Online

Getting Started With. A Step-by-Step Guide to Using WorldAPP Analytics to Analyze Survey Data, Create Charts, & Share Results Online Getting Started With A Step-by-Step Guide to Using WorldAPP Analytics to Analyze Survey, Create Charts, & Share Results Online Variables Crosstabs Charts PowerPoint Tables Introduction WorldAPP Analytics

More information

Teacher Guide. Edline -Teachers Guide Modified by Brevard Public Schools Revised 6/3/08

Teacher Guide. Edline -Teachers Guide Modified by Brevard Public Schools  Revised 6/3/08 Teacher Guide Teacher Guide EDLINE This guide was designed to give you quick instructions for the most common class-related tasks that you will perform while using Edline. Please refer to the online Help

More information

Microsoft Office Excel

Microsoft Office Excel Microsoft Office 2007 - Excel Help Click on the Microsoft Office Excel Help button in the top right corner. Type the desired word in the search box and then press the Enter key. Choose the desired topic

More information

CLAREMONT MCKENNA COLLEGE. Fletcher Jones Student Peer to Peer Technology Training Program. Basic Statistics using Stata

CLAREMONT MCKENNA COLLEGE. Fletcher Jones Student Peer to Peer Technology Training Program. Basic Statistics using Stata CLAREMONT MCKENNA COLLEGE Fletcher Jones Student Peer to Peer Technology Training Program Basic Statistics using Stata An Introduction to Stata A Comparison of Statistical Packages... 3 Opening Stata...

More information

Performing Basic Calculations

Performing Basic Calculations 7.1 LESSON 7 Performing Basic Calculations After completing this lesson, you will be able to: Build formulas. Copy formulas. Edit formulas. Use the SUM function and AutoSum. Use the Insert Function feature.

More information

ECONOMICS 351* -- Stata 10 Tutorial 1. Stata 10 Tutorial 1

ECONOMICS 351* -- Stata 10 Tutorial 1. Stata 10 Tutorial 1 TOPIC: Getting Started with Stata Stata 10 Tutorial 1 DATA: auto1.raw and auto1.txt (two text-format data files) TASKS: Stata 10 Tutorial 1 is intended to introduce (or re-introduce) you to some of the

More information

STATA 13 INTRODUCTION

STATA 13 INTRODUCTION STATA 13 INTRODUCTION Catherine McGowan & Elaine Williamson LONDON SCHOOL OF HYGIENE & TROPICAL MEDICINE DECEMBER 2013 0 CONTENTS INTRODUCTION... 1 Versions of STATA... 1 OPENING STATA... 1 THE STATA

More information

Chapter. Accessing Files and Folders MICROSOFT EXAM OBJECTIVES COVERED IN THIS CHAPTER

Chapter. Accessing Files and Folders MICROSOFT EXAM OBJECTIVES COVERED IN THIS CHAPTER Chapter 10 Accessing Files and Folders MICROSOFT EXAM OBJECTIVES COVERED IN THIS CHAPTER Monitor, manage, and troubleshoot access to files and folders. Configure, manage, and troubleshoot file compression

More information

tabulate varname [aw=weightvar]

tabulate varname [aw=weightvar] 1 Commands Introduced In this chapter you will learn these Stata basics: How to obtain information about a dataset How to obtain information about variables How to write and save a Do-file (a file that

More information

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Objectives: 1. To learn how to interpret scatterplots. Specifically you will investigate, using

More information

Spreadsheet definition: Starting a New Excel Worksheet: Navigating Through an Excel Worksheet

Spreadsheet definition: Starting a New Excel Worksheet: Navigating Through an Excel Worksheet Copyright 1 99 Spreadsheet definition: A spreadsheet stores and manipulates data that lends itself to being stored in a table type format (e.g. Accounts, Science Experiments, Mathematical Trends, Statistics,

More information

MS Excel Henrico County Public Library. I. Tour of the Excel Window

MS Excel Henrico County Public Library. I. Tour of the Excel Window MS Excel 2013 I. Tour of the Excel Window Start Excel by double-clicking on the Excel icon on the desktop. Excel may also be opened by clicking on the Start button>all Programs>Microsoft Office>Excel.

More information

Introduction to Excel 2013

Introduction to Excel 2013 Introduction to Excel 2013 Copyright 2014, Software Application Training, West Chester University. A member of the Pennsylvania State Systems of Higher Education. No portion of this document may be reproduced

More information

BE Share. Microsoft Office SharePoint Server 2010 Basic Training Guide

BE Share. Microsoft Office SharePoint Server 2010 Basic Training Guide BE Share Microsoft Office SharePoint Server 2010 Basic Training Guide Site Contributor Table of Contents Table of Contents Connecting From Home... 2 Introduction to BE Share Sites... 3 Navigating SharePoint

More information

Exsys RuleBook Selector Tutorial. Copyright 2004 EXSYS Inc. All right reserved. Printed in the United States of America.

Exsys RuleBook Selector Tutorial. Copyright 2004 EXSYS Inc. All right reserved. Printed in the United States of America. Exsys RuleBook Selector Tutorial Copyright 2004 EXSYS Inc. All right reserved. Printed in the United States of America. This documentation, as well as the software described in it, is furnished under license

More information

Simply Accounting Intelligence Tips and Tricks Booklet Vol. 1

Simply Accounting Intelligence Tips and Tricks Booklet Vol. 1 Simply Accounting Intelligence Tips and Tricks Booklet Vol. 1 1 Contents Accessing the SAI reports... 3 Copying, Pasting and Renaming Reports... 4 Creating and linking a report... 6 Auto e-mailing reports...

More information

Getting help with Edline 2. Edline basics 3. Displaying a class picture and description 6. Using the News box 7. Using the Calendar box 9

Getting help with Edline 2. Edline basics 3. Displaying a class picture and description 6. Using the News box 7. Using the Calendar box 9 Teacher Guide 1 Henry County Middle School EDLINE March 3, 2003 This guide gives you quick instructions for the most common class-related activities in Edline. Please refer to the online Help for additional

More information

5. Excel Fundamentals

5. Excel Fundamentals 5. Excel Fundamentals Excel is a software product that falls into the general category of spreadsheets. Excel is one of several spreadsheet products that you can run on your PC. Others include 1-2-3 and

More information

Instructions for Using the Databases

Instructions for Using the Databases Appendix D Instructions for Using the Databases Two sets of databases have been created for you if you choose to use the Documenting Our Work forms. One set is in Access and one set is in Excel. They are

More information

A Guided Tour Through the SAS Windowing Environment Casey Cantrell, Clarion Consulting, Los Angeles, CA

A Guided Tour Through the SAS Windowing Environment Casey Cantrell, Clarion Consulting, Los Angeles, CA A Guided Tour Through the SAS Windowing Environment Casey Cantrell, Clarion Consulting, Los Angeles, CA ABSTRACT The SAS system running in the Microsoft Windows environment contains a multitude of tools

More information

Scorebook Navigator. Stage 1 Independent Review User Manual Version

Scorebook Navigator. Stage 1 Independent Review User Manual Version Scorebook Navigator Stage 1 Independent Review User Manual Version 9.8.2010 TABLE OF CONTENTS Getting Started... 1 Browser Requirements... 1 Logging in... 2 Setting Up Your Account... 2 Find Your Scorebook...

More information

Rev. C 11/09/2010 Downers Grove Public Library Page 1 of 41

Rev. C 11/09/2010 Downers Grove Public Library Page 1 of 41 Table of Contents Objectives... 3 Introduction... 3 Excel Ribbon Components... 3 Office Button... 4 Quick Access Toolbar... 5 Excel Worksheet Components... 8 Navigating Through a Worksheet... 8 Making

More information

LEGENDplex Data Analysis Software Version 8 User Guide

LEGENDplex Data Analysis Software Version 8 User Guide LEGENDplex Data Analysis Software Version 8 User Guide Introduction Welcome to the user s guide for Version 8 of the LEGENDplex data analysis software for Windows based computers 1. This tutorial will

More information