Introduction to Stata


Workshop Introduction to Stata
MSc Economics / MSc STREEM / AUC
Aug 2017
Zichen Deng
VU University Amsterdam

Contents

0 PREFACE
1 GETTING STARTED
    1.1 Stata at VU University and AUC
    1.2 Start
    1.3 Memory
    1.4 Interactive mode and batch mode
    1.5 Log-files
2 DO-FILE INGREDIENTS AND KEY COMMANDS
    2.1 Administrative commands
    2.2 Loading and viewing the data
    2.3 Generating and transforming variables
    2.4 Describing variables
    2.5 Saving the data
    2.6 Executing the do-file
3 LEARNING TO HELP YOURSELF
4 DIFFERENT TYPES OF VARIABLES
    4.1 Storage types
    4.2 Categorical variables (among which: dummy variables)
    4.3 Converting strings
    4.4 Working with dates
    4.5 Different data types in the Data Editor
    4.6 Missing observations/missing values
5 MORE ON SYNTAX
    5.1 Functions
    5.2 If-statements
    5.3 Loops
    5.4 By and bysort
    5.5 Recode
    5.6 Abbreviating variable names
    5.7 Macros
    5.8 Scalars
6 GRAPHS
    6.1 Saving a graph
    6.2 (Overlaid) two-way graphs
7 INSTALLING USER WRITTEN COMMANDS
8 ECONOMETRIC ANALYSIS
    8.1 Correlation coefficient
    8.2 T-test of equal means
    8.3 Linear regression model
    8.4 Post-estimation commands
    8.5 Storing estimation results
    8.6 All estimation results in one table
9 MISCELLANEOUS TOPICS
    9.1 Reading a dataset with a different format
    9.2 Combining datasets
    9.3 System variables and information Stata stores from statistical analysis: _n, _N, and (e)return list
    9.4 Other useful commands: assert, capture, quietly
10 STATA RESOURCES
    Books available in the VU library
    Online resources
11 COMMAND OVERVIEW

0 Preface

Stata is a versatile statistical software package that is widely used in the international research community in economics and other social sciences. It has features both of a software package for data management and statistical/econometric work ("press the button and get results") and of a programming language ("tell the computer exactly what you want to compute"). Learning to use it will certainly pay off in the long run, and may already have immediate returns in the first course of the MSc curriculum on Advanced Methods.

The author of the present document is Jonneke Bolhaar (VU University Amsterdam and CPB Netherlands Bureau for Economic Policy Analysis, The Hague). It has been slightly revised by Stefan Hochguertel (VU University Amsterdam) and Zlata Tanović (VU University Amsterdam and Amsterdam Institute for International Development). This document may contain errors; apologies for them. If you encounter any, please let us know at zichen.deng@vu.nl.

1 Getting Started

Stata is available for different computer systems (Windows, which is used in this tutorial, as well as Mac and Unix) and comes in four different types:

Small Stata: has a maximum of 99 variables and 1200 observations.
Stata/IC (intercooled Stata): regular version, up to 2047 variables.
Stata/SE (special edition): for large datasets.
Stata/MP (multiprocessor): same as Stata/SE, but faster because it can use multiple processors at the same time to perform Stata commands.

Stata comes in versions: StataCorp regularly releases a new version of the program. The latest version is Stata 14. Newer versions have additional capabilities, but you can use your programs in different versions of Stata. In general, changes between versions are documented, and you can find out about differences between versions by typing

    help whatsnew

1.1 Stata at VU University and AUC

For VU students, Stata/SE 14 is available in the computer rooms. The number of licenses is limited (university wide), but it is large enough to hardly ever cause problems. It is good to know, however, that when you try to access Stata and an error message appears on your screen saying Stata cannot be opened, the cause may be that the maximum number of licenses is in use at that moment. AUC students have a license for Small Stata.

1.2 Start

Once you have started Stata, you will see a large window containing several smaller windows (Figure 1).

[Figure 1: the different windows of Stata (Review, Results, Variables, Command, Properties)]

The largest window is called the Results window and shows the results of the analyses you perform. The window at the bottom is the Command window, where you tell Stata what you want it to do. All commands you have given Stata since you started your session are listed in the Review window. After you have loaded your dataset, the Variables window will contain a list of all variables. Stata 12/13/14 looks slightly different from version 11 (Figure 1): a new window has been added to the interface, the Properties window. You can manage the variables in your dataset directly from the Properties window (e.g. add or modify a label, or change the format of a variable).

1.3 Memory

When you open Stata, it automatically assigns 50 MB of memory (only 10 MB in older versions; how much is assigned in the version you're working with is stated in the Results window when you open Stata). This might be too little if you work with a large dataset or want to create many variables. To assign more memory to Stata (for example 100 MB), type

    set memory 100m

You can only change the amount of memory that is assigned when there is currently nothing in Stata's memory. If you encounter memory problems while working, you first have to save your data and clear the memory with the command clear before changing the amount of memory assigned. The maximum amount of memory that can be assigned depends on the internal memory of the PC you're working on. In Stata 12/13/14 you no longer have to set the memory yourself; Stata takes care of it automatically.

1.4 Interactive mode and batch mode

Stata can be operated interactively or in batch mode. When you use Stata interactively, you type a Stata command in the Command window and hit the Return/Enter key on your keyboard. Stata executes the command and the results are displayed in the Results window. Then you enter the next command, Stata executes it, and so forth, until the analysis is complete. You can see the commands that you entered previously in the Review window and bring them to the command line again with a single click. In the same way you can bring the name of a variable from the list in the Variables window to the command line.

Using one of the pull-down menus is another variant of using Stata in interactive mode. Stata executes the command that you specified in the dialog box. The command also appears in the Review window, and you can access it again by single-clicking on it, after which it will appear in the Command window. You can then edit it, like a command that you entered yourself on the command line, before you press Enter. This is a handy way of accessing commands that you are not yet familiar with, but it is rather slow.

When Stata is used in batch mode, all of the commands for the analysis are listed in a file, and Stata is told to read the file and execute all of the commands. Such a file with a series of commands is called a do-file by Stata and is saved using a .do suffix. Using Stata in batch mode has two important advantages over using Stata interactively.

First, the do-file provides an audit trail for your work. The file provides an exact record of each Stata command. This might not seem that important in the course, where we will see only simple and short command sequences. But for serious applications later, like in your thesis or in scientific, consulting, or government work, reproducibility of results is a major issue. At any point in time, you should be able to reproduce your results from the original dataset. The order in which you manipulate variables and run regression commands may be very important.

Second, even the best computer programmers will make typing or other errors when using Stata. When a command contains an error, it won't be executed by Stata, or worse, it will be executed but produce the wrong result. Following an error, it is often necessary to start the analysis from the beginning. If you are using Stata interactively, you must retype all of the commands. If you are using a do-file, you only need to correct the command containing the error and rerun the file.

You open a do-file by clicking: [toolbar button image]

1.5 Log-files

A log-file is a text file containing all the output of your program (that is, everything that appears in the Results window). Log-files have the suffix .log. Storing your results in a log-file is useful when you want to be able to access the results of your analysis without having to run the program again, for example on a computer where Stata is not available (using Notepad to read the log-file), or to take with you to a meeting with your thesis supervisor. In the next section you will learn how to create do-files and log-files.

2 Do-file Ingredients and Key Commands

A Stata do-file has four different kinds of commands or ingredients:

1. Administrative commands that tell Stata where to save results, how to manage computer memory, and so forth.
2. Commands that tell Stata to read and manage datasets.
3. Commands that tell Stata to modify existing variables or to create new variables.
4. Commands that tell Stata to carry out the statistical analysis.

Here is an example:

[Screenshot: example do-file in the do-file editor]

A Stata .do file is nothing but a plain text file, and hence it can be edited in any editor (such as Notepad). Stata has its own editor (the aptly named "do-file editor"), however, that also offers a few conveniences. To open this editor, type

    doedit <filename>

in the command line. If the file <filename>.do exists, it will be opened (if not, an error message ("file not found") will pop up). Without <filename>, the editor starts with a blank sheet.

Note: If <filename> contains a space, then quotation marks should be added to the name, i.e. "<filename>". It is good practice to always use quotation marks with path and file names.
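The four ingredients listed above can be sketched in a short do-file. This is only an illustrative sketch, not the workshop's own example file: it uses Stata's built-in auto dataset as a stand-in for dataset_workshop.dta, the file names example.log and auto_modified.dta and the variable price_k are hypothetical, and the line positions do not correspond to the line numbers cited in the following sections. A cd command is left out because the folder differs per computer.

```stata
* example.do: a minimal do-file with the four kinds of commands

* (1) administrative commands
capture log close                  // close a log left open by an earlier run, if any
log using "example.log", replace   // write all output to example.log
set more off                       // do not pause after each screen of output

* (2) read the data
sysuse auto, clear                 // stand-in for: use "dataset_workshop.dta", clear
describe

* (3) create or modify variables
generate price_k = price / 1000    // hypothetical new variable
label variable price_k "price in thousands of dollars"

* (4) statistical analysis
summarize price_k mpg
tabulate foreign

* save the modified data under a new name and close the log
save "auto_modified.dta", replace
log close
```

Saving the result under a new file name, rather than overwriting the original data, keeps the original dataset intact so that the do-file can be rerun from scratch.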

The Stata editor automatically assigns different colors to different types of commands. Commands are things that Stata understands and acts on: it executes them. Very useful for any type of computer work is the material that is meant to be understood not by the computer but by a human: comments. Use comments judiciously to document your work (so that you are able to retrace your steps when you look at things again after a while; comments also help others when they want to read or use your do-file). Use // or * to tell Stata that what follows on this line is a comment. If the comment you want to type stretches over more than one line, use /* and */ to denote the start and end of the comment.

We will now go through this small program and discuss its elements step by step.

2.1 Administrative commands

The second line,

    cd "H:\Documents\statafiles\"

tells Stata which folder/directory is the one that we will be working from. All files will be used from and saved to this folder. This is convenient, because now in all commands that use or save a file, we only need to type the name and not its full path. Note that the directory is in quotation marks. In this case it is not strictly necessary, but it would be if there were a space in the path. For example, writing H:\myfiles\stata files\ instead of "H:\myfiles\stata files\" would give an error!

The third line is a command that tells Stata where to write the log-file with the results of the analysis. To open a log-file called stata1.log in the current folder, use the command

    log using stata1.log, replace

With replace you instruct Stata to replace any existing file with the same name in the same folder. For the course Advanced Methods, you usually have to hand in the do-file and log-file with your solution to the homework assignment. Don't forget to end your do-file by closing the log-file with the command log close.

The command set more off tells Stata not to pause after displaying every new page of results. By default, Stata pauses every time the Results window is filled with new results, and -more- is displayed at the bottom of the Results window. Execution of your program will only continue after you have pressed a key on the keyboard. As you are saving all results in a log-file, you may not find it necessary that Stata pauses after every page of results. By including set more off at the beginning of your do-file you can get rid of this. (Why is this important? Sometimes you want Stata to do lengthy calculations while you go for a coffee. If it stops execution with a -more-, it will not have made any progress when you come back from your coffee break.)

2.2 Loading and viewing the data

Line 7 in our example do-file loads the dataset with the command

    use dataset_workshop.dta, clear

The name of the dataset contains no spaces, so we don't need to use quotation marks. clear makes sure that if there was a dataset still in memory, it is cleared (note that this clears the memory without saving it first, so all unsaved changes to the dataset in memory will be lost by using clear!).

describe in line 8 tells Stata to describe the dataset. This command produces a list of the variable names and any variable descriptions stored in the dataset. The latter are called variable labels. The list also contains the storage type (more about this in section 4.1), the display format (not important for now), the name of the value label (more about value labels in section 4.2) and the variable label attached to the variable (more about variable labels in section 2.3).

Another command that provides a lot of general information on your data is summarize. It gives a table with the number of non-missing observations and the mean, standard deviation, minimum and maximum for a variable. If you use the summarize command without a list of variables, Stata produces summary statistics for all variables in the dataset.
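As a small illustration of describe and summarize, the following sketch uses Stata's built-in auto dataset (an assumption made so that the lines run without the workshop data; the variables price, mpg and make belong to that dataset, not to dataset_workshop.dta).

```stata
sysuse auto, clear        // load the built-in example dataset
describe                  // names, storage types, display formats, labels
summarize                 // all variables; the string variable make shows 0 observations
summarize price mpg       // only the listed variables
summarize price, detail   // adds percentiles, skewness and kurtosis
```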

A look at the output generated by the summarize command for our dataset shows that this command only works for numerical variables. For the two string variables ("zdate" and "sex") the table is empty, except for indicating that there are zero observations for these variables. In this case, zero observations means zero numerical observations. Both variables do actually have observations, but they are string variables (a string is non-numerical text, but even numbers can be stored as text, e.g. "99"). Keep in mind when using this command that it has no relevance for string variables.

The command tabstat is a more advanced version of summarize. For example,

    tabstat yearb, stats(mean N)

will show the average of the yearb variable, as well as the number of observations for which this variable is non-missing. Type help tabstat to see the list of statistics that can be shown, as well as explanations of other useful options such as by().

If you want to have a look at the actual data, for example to see whether everything went fine with loading the data, you can open the Data Editor by clicking one of these buttons:

[Toolbar buttons: Data Editor (Edit) and Data Editor (Browse)]

The left button opens the Data Editor in edit mode, the right button opens it in browse mode. In edit mode you can change the data (by clicking on a cell or column); in browse mode you can't. The Data Editor will open in a new window.[1]

As with most commands in Stata, you can also open the Data Editor with a command in the Command line or in your do-file. The command to open the Data Editor in browse or edit mode is, respectively, browse or edit. This has the advantage that you can select which variables you want to be displayed in the Data Editor. For example,

    browse id yearb monb dayb

opens the Data Editor with only these four variables. From Stata 12, the new Properties window is also part of the Data Editor. In addition, the Data Editor contains a special version of the Variables window where you can select which variables are shown in the Editor.

[1] In Stata 9 (and all older versions), you cannot use the Command line or run a do-file while the Data Editor is open (they are blocked automatically). So first close it before you continue (minimizing it is not enough!). In newer versions of Stata, you can leave the Data Editor open while entering commands in the Command line or running a do-file.

2.3 Generating and transforming variables

In line 12 of our program we generate a new variable with the generate command. Other frequently used commands to transform data are:

replace, which modifies an existing variable:
    replace var1 = 0
replaces the value of var1 by 0 for all observations.

rename, which changes the name of a variable:
    rename oldname newname
changes the name of variable oldname into newname.

drop, which drops a variable:
    drop var1 var2
drops the variables with the names var1 and var2.

keep, which keeps the listed variables:
    keep var1 var2
keeps only the variables var1 and var2 and drops all others.

The name of a variable may not contain blanks and is case-sensitive. Try to keep names short and clear. The maximum number of characters for a variable name is 32, but Stata prints only 12 in the output of many commands (for example regression results). You can attach a label of at most 80 characters to the variable. You can use the label to give a more precise description of the variable. This is how you create a label for the variable intervdate that says "date of interview":

    label variable intervdate "date of interview"

Whenever you make a table or graph, Stata uses this label instead of the variable name.

2.4 Describing variables

To describe one of the variables in the data in a frequency table, you can use the command tabulate, as we did in line 16 of the do-file. The table will appear in the Results window.

tabulate can also be used to make a cross-tab if you put two variable names after the command. For example,

    tabulate children sex

will create a table with two columns: one with the frequencies of the variable children for males and one with the frequencies for females.

2.5 Saving the data

In line 23 of our program we save the altered dataset (altered because we generated a new variable called wagegr_year). For this, we use the command

    save dataset_workshop.dta, replace

The option replace is used to overwrite any existing dataset with this name. If there is no existing file with this name, a new file is created.

There is one thing you should take care of, though, if you want to work in more than one version of Stata: while versions 8 and 9 have no problem understanding datasets created by one of these versions, and the same holds for versions 10 to 12, you will encounter problems if you try to load a dataset created in version 10 or above into version 9. The solution is simple: the command saveold saves a version of your dataset created in version 10 or above that can be read by version 9 or below. In Stata 13, saveold likewise saves the data in a format that can be read by earlier versions.

To save the data in another format, e.g. Excel, use Stata's export command. For example:

    export excel using dataset_workshop.xls, firstrow(variables) replace

The dataset has now been saved as an .xls file, with the variable names (not the variable labels) in the first row. See section 9.1 for an explanation of how Stata can import .xls files.
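The saving commands described above can be tried out with the following sketch. It assumes Stata's built-in auto dataset instead of the workshop data, and the file names auto_copy.dta, auto_copy_old.dta and auto_copy.xls as well as the variable price_k are hypothetical. The files are written to the current working directory.

```stata
sysuse auto, clear
generate price_k = price / 1000          // change the data in memory

save "auto_copy.dta", replace            // Stata format of the running version
saveold "auto_copy_old.dta", replace     // format readable by older versions
export excel using "auto_copy.xls", firstrow(variables) replace

use "auto_copy.dta", clear               // reload to check that the change was saved
describe price_k
```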

2.6 Executing the do-file

There are two ways to execute your do-file, and for each of them there is a button in the do-file editor. The right button executes your do-file normally and will show the results in the Results window. Clicking this button is the same as typing in the Command line

    do stata1.do

Alternatively, you can click on the File menu, then Do, and then select the file C:/myfiles/statafiles/stata1.do. This will also run the do-file. The left button executes your do-file quietly and is equivalent to typing

    run stata1.do

in the Command line. Running a do-file quietly implies that the results will not be displayed in the Results window and will also not appear in the log-file.

If you want to execute only some lines of your do-file (for example because you want to add some lines to an existing do-file that you already ran and stored earlier), you can do so by selecting the lines you want to execute and clicking on the do or the run button.

3 Learning to Help Yourself

Using a program like Stata, you will frequently encounter situations where you either don't know what a particular command is doing exactly or where you don't know how to perform a particular analysis in Stata. There are a variety of possibilities for moving on in a situation like this.

If you know the command, it is useful to start with the built-in help of Stata. Stata has detailed help files available for all Stata commands. You can access these by selecting Stata Command from the Help drop-down menu and entering the command in the window that pops up. You can also just type

    help <command_name>

at the command line. Similar pages can be accessed at Stata's website, and a Google search for "stata help <command_name>" usually also gets you there quickly.

Stata commands are described in detail in the Stata User's Guide and Reference Manual. In Stata 11-13, the built-in Help also contains a link at the end of each entry (under "Also see") to the relevant pages in these manuals. Clicking it will open a pdf of the manual at the right page.[2]

If you know what you want to do but don't know the exact Stata command, there are two things you can do. First, you can select Search... from the Help drop-down menu and type in a keyword (for example, the name of an estimation method). Second, it is likely that a Google search will also get you to the information you are looking for quickly. Stata is widely used and you are probably not the first one looking for a command to perform a particular action.

[2] A paper version of the manual may be available at the IT help desk.

In the example above you see the help file for the tabulate command. The syntax of a command is always built up in a similar manner. Here, the syntax for a two-way table is

    tabulate varname1 varname2 [if] [in] [weight] [, options]

The Stata command itself (here: tabulate) is always in bold type. The underlined part of the command, tab, is the way the command may be abbreviated. Ingredients of the command are displayed in italics. Necessary ingredients appear without square brackets, and ingredients that are not strictly necessary appear between square brackets. Here, varname1 varname2 in the help file teaches us that for a two-way table, we need to include two variable names. [if] [in] [weight] tells us that, if we want to, we can add an if-statement, an in-statement or variable weights. If we want to use one of these, we should add it immediately after the variable names and before the comma. Finally, [, options] indicates that if we want to use any of the options available for this command, we should place them after a comma.

Almost every command has several options to adapt the command. The options are listed and explained in the help file (you find the explanation of each option if you scroll down in the help file). tabulate has, for example, the option row. With this option, the two-way table that is produced contains not only frequencies, but also the relative frequency within each row. If we want a two-way table that contains only these relative frequencies per row and does not contain the frequencies, we add not only the option row, but also the option nofreq.

The blue colored words in the syntax indicate that there is a separate entry available for them. Clicking, for example, on the word weight will redirect you to the help file on using weights.

Note that the brackets that appear in the command description are not to be included in the command you type! They are merely there to indicate the required and optional parts of the command.
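To see how the syntax diagram translates into actual commands, here is a short sketch. It assumes Stata's built-in auto dataset (with the variables rep78, foreign and price) rather than the workshop data, so the variable names are stand-ins.

```stata
sysuse auto, clear

tabulate rep78 foreign                           // two variable names: a two-way table
tabulate rep78 foreign, row                      // option row adds row percentages
tabulate rep78 foreign, row nofreq               // row percentages only, no frequencies
tab rep78 foreign if price > 5000, row nofreq    // abbreviated command plus an if-statement
```

The if-statement comes directly after the variable names and before the comma, and none of the brackets from the syntax diagram are typed.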

Here is another example, the entry of the command destring. The syntax for the command destring is

    destring [varlist], {generate(newvarlist) | replace} [destring_options]

Again, the bold words are the part of the syntax that is always required. There is one difference here, however. The boldface words generate and replace are enclosed in curly brackets and separated by a vertical line. This means that the syntax for the command destring requires either generate or replace to be specified. All the available options are listed in the lower part of the screen. They are explained in more detail if you scroll down the help file.
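The "either generate or replace" requirement can be illustrated with a few lines. This is a sketch on made-up data: the variable names income_str and income are hypothetical and the three values are typed in with input only for the purpose of the example.

```stata
clear
input str6 income_str
"1200"
"850"
"3100"
end

destring income_str, generate(income)   // keeps the string, adds the numeric variable income
describe
summarize income

destring income_str, replace            // overwrites the string variable with a numeric one
describe
```

Typing destring income_str without generate() or replace is rejected by Stata, because one of the two must be specified.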

4 Different Types of Variables

4.1 Storage types

In section 2.2 we already came across the term storage type when we discussed the output that describe generates. Stata has six different types of data (storage types), of which five are for numbers (numeric data) and one is for text (string data). The five numeric data types are called byte, int, long, float and double. What are the differences between these five types? First, byte, int, and long can only hold integers (i.e. no decimals). float and double can also hold non-integers. Second, the precision of each type differs. The table below gives the minimum and maximum value a variable of each type can take. Larger or smaller values will result in a missing observation, denoted by "." in Stata.

            minimum                  maximum                 closest to 0, but not 0
byte        -127                     100                     1 / -1
int         -32,767                  32,740                  1 / -1
long        -2,147,483,647           2,147,483,620           1 / -1
float       -1.70141173319*10^38     1.70141173319*10^38     10^-38 / -10^-38
double      -8.9884656743*10^307     8.9884656743*10^307     10^-323 / -10^-323

A variable that contains text is stored as a string variable, or str#. In place of the # is the maximum number of characters the string can contain. This can be anything from 1 to 244 (from Stata 13 onward, up to 2045). If you try to fit more characters than the data type allows, every character beyond # is ignored.

The default data type for numeric data when you generate a variable is float. If you want to generate another type, place the name of the type between the generate command and the name of the variable to generate. The default data type is fine for most purposes, but there are cases where the default is problematic. One case is identification numbers. A float has about 7 digits of accuracy, and will therefore round an identification number with 8 or more digits. To preserve the identification number as it is, use long for 8 or 9 digits and double for up to 16 digits of accuracy.

Precision is often important to people doing numerical work, and a little reading on numerical issues will tell you that computers cannot uniformly handle all numbers that humans typically deal with in the same way. For instance, did you know that the decimal number 0.1 has no finite precision representation in binary floating-point arithmetic?
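The precision issues just described can be made visible with a small experiment. The following is a sketch on a one-observation dataset created for this purpose; the variable names x, y, id_f and id_l and the identification number 123456789 are hypothetical.

```stata
clear
set obs 1

generate x = 0.1                // stored as float, the default type
display x[1] == 0.1             // shows 0: the float value differs from the double-precision 0.1
display x[1] == float(0.1)      // shows 1: compare like with like
generate double y = 0.1
display y[1] == 0.1             // shows 1

generate id_f = 123456789       // a 9-digit number stored as float is rounded
generate long id_l = 123456789  // long keeps all 9 digits
format id_f id_l %12.0f
list id_f id_l                  // id_f shows 123456792, id_l shows 123456789
```

The type name is placed between generate and the new variable name, as described above.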

(See, e.g., Wikipedia's entry on floating point for details.) This property can occasionally lead to unexpected results. Changing the data type from float to double can partly address such issues. However, it is not necessarily a good idea to store everything as double, because double takes up a lot of memory. To strike a balance, you can ask Stata to convert variables to the smallest storage type that holds them without losing information by typing

    compress [varlist]

4.2 Categorical variables (among which: dummy variables)

Categorical variables are variables whose values do not have the meaning of a normal number, but where each value stands for a category. An example is the variable za1 in our dataset, which equals 1 if the answer is yes and 2 if the answer is no to the question "Do you work at least 15 hours per week?". Instead of working with 1 and 2, we can attach a value label to the variable. The value label contains the information about the meaning of every number of the categorical variable. When you ask Stata to make, for example, a frequency table, it will use the text you attached to each number instead of the numbers. The value labels are also used in the Data Editor. To create a value label with the name yesno that contains the information that 1 means yes and 2 means no, you type

    label define yesno 1 "yes" 2 "no"

and to assign this label to the variable za1:

    label values za1 yesno

Use label list to get a list with the names and contents of all value labels in the dataset. You can modify an existing label using the command label define and one of the options add, modify or replace after the comma. Although the label instead of the number appears in tables and graphs, you keep using the number in if-statements etc.

Dummy variables are special cases of categorical variables with binary information (i.e., they can take two values). Dummy variables are very often used in econometric work, and there they typically take the values 0 and 1. Often, a categorical variable that has more than two discrete values is split into a set of binary dummy variables. For instance, suppose you have a variable color taking values 1, 2, 3 for blue, red, and green. You might want to split this into two dummy variables, one for blue (value 1 for blue and 0 for non-blue) and one for red (value 1 for red and 0 for non-red); the remainder is then green by implication (i.e., observations for which both blue and red have value 0).
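The color example can be written out as follows. This is a sketch on four made-up observations; the label name colorlbl and the dummy names blue and red are hypothetical, and the fourth observation is set to missing to show that the dummies should then be missing as well.

```stata
clear
input color
1
2
3
.
end

label define colorlbl 1 "blue" 2 "red" 3 "green"
label values color colorlbl
tabulate color                  // the table shows blue, red, green instead of 1, 2, 3
label list colorlbl

* dummy variables created by hand; the if-condition keeps missing colors missing
generate byte blue = (color == 1) if !missing(color)
generate byte red  = (color == 2) if !missing(color)
list, nolabel                   // show the underlying numbers
```

Without the if-condition, an observation with a missing color would get the value 0 for both dummies and would wrongly look like a green car.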

One quick way to convert a categorical variable with many discrete values into a set of dummy variables is afforded by the tabulate command with the option generate, as in

    tabulate color, generate(dumclr)

This will first give you the usual tabulation of color, and then make a number of dummy variables called dumclr1, dumclr2, etc., flagging the different color values; these newly created variables are all of storage type byte and come automatically with variable labels. In addition, this way of creating dummy variables also properly deals with missing values. (More on missing values below.)

4.3 Converting strings

Your dataset may contain string variables that only include numbers. As numerical variables can be included in a regression, but string variables cannot, you may want to convert the data in such a variable from string to numeric. For this purpose you can use the command destring. The complete syntax to create a new, numeric variable called new_var1 that contains the converted content of the string variable var1 is

    destring var1, generate(new_var1)

Instead of generating a new variable, we could also have chosen to overwrite the content of the existing variable by using replace instead of generate(new_var1). If you want to perform exactly the opposite action, converting a numerical variable into a string, you can use tostring.

Another possibility is that the dataset contains a string variable with a limited number of different texts. For example, the string variable may contain only "yes" or "no", or it may contain one of the ratings "very good", "good", "reasonable", "bad", "very bad". To create a categorical variable from this string, you can use the command encode. If var2 contains "yes" or "no" for each observation,

    encode var2, generate(new_var2)

creates a new categorical variable named new_var2. The labels created for this variable are stored in a value label with the same name as the new variable. If you want to use an existing label for the new categorical variable, add label(labelname) to the syntax.

4.4 Working with dates

Working with dates is one case where older and newer versions of Stata differ. The method described here applies to version 11 and up.

Stata has a special format to store dates. The advantage of using this format is that Stata understands things like "1feb2012 minus one day is 31jan2012". Usually when a dataset is supplied to you, dates are recorded as a string variable. To create a new variable dob that converts the string variable dateofbirth (containing dates written in day-month-year order) to the date format, you type

    generate dob = date(dateofbirth, "DMY")
    format dob %td

The second element in the date function, "DMY", indicates how the dates are built up. In this case it is day-month-year. The format for daily dates is %td. It is no problem if the original string variable with dates has hyphens or slashes to separate day, month and year. Stata will ignore them.

What date essentially does is create an integer with the number of days (for daily data, or weeks for weekly data, etc.) since January 1, 1960 (it is a negative number for dates earlier than January 1, 1960). With format %td you tell Stata how to interpret this number. %td means it should be interpreted as days since January 1, 1960, %tw means it should be interpreted as weeks since the first week of 1960, etc. In the Data Editor and in tables and graphs, Stata will display for example 16jan1983 if %td is specified and 1983w3 if %tw is specified.

You can now easily generate a variable that contains the duration between two dates:

    generate duration = datevar1 - datevar2

will do just that.

4.5 Different data types in the Data Editor

The Data Editor uses different colors for different types of variables. Numerical variables that are not categorical and variables in date format are displayed in black. Categorical numeric variables are displayed in blue. For categorical variables, the Data Editor displays the label assigned to each number. If you want to open the Data Editor showing the numbers instead of the labels assigned to them, use browse, nolabel. String variables are displayed in red.
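The date conversion of section 4.4 can be tried out on a tiny made-up dataset. In this sketch the variable names interview, intdate and the four dates are hypothetical; one string variable uses hyphens and the other slashes to show that both separators are accepted.

```stata
clear
input str10 dateofbirth str10 interview
"16-01-1983" "01/02/2012"
"31-01-1990" "15/03/2012"
end

generate dob     = date(dateofbirth, "DMY")   // days since 1 January 1960
generate intdate = date(interview, "DMY")
format dob intdate %td                        // display the numbers as dates

generate duration = intdate - dob             // difference in days
list

display %td td(1feb2012) - 1                  // shows 31jan2012
```

Note that format only changes how the number is displayed; the stored value remains the count of days.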
4.6 Missing observations/missing values

Missing values deserve special attention because they can surface in unexpected situations and may lead to unexpected results. Simply put, a missing value is a value that is not or should not be there. Let's start with a simple example of a numerical missing value that is kept in the data as a dot: "."

Consider the following story. An interviewer asks respondents if they own a car, and the ones that answer yes (value 1) get a follow-up question on the color of their car: blue (value 1), green (value 2), or red (value 3). So our data set may look like

    person  havecar  carcolor
    1       1        1
    2       1        2
    3       0        .
    4       1        3

Person 3 has no car and therefore gets a missing value for carcolor. The symbol used for the missing value is a so-called system missing value that tells Stata how to handle it. In real data, one may encounter all kinds of values that are actually missing but not necessarily recognizable as such without further documentation. For instance, we may have values such as -9 or 9999 or 97, or any other number. This matters because, depending on what value is being used, the rules of arithmetic may deliver very different results. Suppose we were to combine the variables havecar and carcolor, using

    generate nonsense = (1-havecar) + 2*carcolor

then the variable nonsense in our data set would take values 2, 4, ., 6 for the four persons; that is, adding (or multiplying, etc.) a system missing value to any numerical value delivers a system missing. Had our missing value been 99 instead, we would have obtained 2, 4, 199, 6, and we would have no way of telling that 199 is in fact the result of a missing value operation. It is therefore good practice to closely inspect all variables in a data set for special values that may in fact be missings. With some luck, there is documentation on all values of variables in the data set, telling that, e.g., 998 is not actually 1000 minus 2, but rather something else (for instance a code for "not applicable"). With less luck, you need to figure this out yourself.

Remark. Next to the system missing value, Stata has other types of missings that are treated arithmetically in the same way. These are extended missing values that have the codes .a, .b, .c, ..., .z. There are not many instances where these extended missing values are actually being used in practice, but they may come in handy to convey different meanings.
For instance, .a may indicate "I don't know", .b may mean "I do not want to say" (refusal), and .n may mean "question not asked to respondent".

So, the way Stata handles missings is quite convenient. However, you need to know that missing values are internally handled as if they represented the value infinity: they are larger than any number (indeed, infinity plus or times something else also results in infinity). This can be confusing when you want to use operators on variables that contain missings. Consider the following example in which the variable nonblue is supposed to flag all cars that are not blue (blue was value 1):

    generate nonblue = carcolor>1

This will result in a new variable that has two values: 1 if the expression carcolor>1 is true, and 0 if it is not true. In the data, nonblue equals 0 for person 1 and 1 for persons 2, 3, and 4. However, person 3 has no car, so carcolor is missing, and yet our definition has assigned value 1 to variable nonblue. We could now go on and do calculations for person 3 using variable nonblue even though the original variable carcolor would not allow us to do calculations for that observation.

Warning: this example illustrates a common mistake; the danger is that missing values disappear from the data. This behavior may be unintended, although it follows from the logic that the missing value is treated as infinity (that is, the expression carcolor>1 is true for person 3 since Stata reads it as "infinity>1"). In order to fix this, we have to be more explicit:

    generate nonblue = carcolor>1 if carcolor!=.

The condition if carcolor!=. means that the expression carcolor>1 is only evaluated if carcolor!=. is true. For person 3 it is false, and the expression is not being evaluated. In that case, the new variable gets value system missing. If the data also contain extended missing values, use if !missing(carcolor) or if carcolor<. instead, because .a, .b, etc. are not equal to the system missing value and would pass the condition carcolor!=.

The story so far applied to numerical missing values. In the Data Editor, numerical missing observations are also indicated by a dot. Missing string variables have a completely empty field: missing values of a string variable are equal to "" (double quotation marks with nothing in between). You can also use these representations in expressions, e.g.

    generate myvar = ""

would generate a string variable with missing values only.

Important note: In regressions, observations that have a missing value for any one of the specified variables are NOT taken into account in the estimation! Adding one variable with many missings to your model may therefore dramatically decrease your sample size and hence affect the results. This is particularly true if your regression model has many variables, each of them having many missing values.
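The pitfall described in this section can be reproduced with a few lines. The sketch below types in the four-person car data, with values as implied by the text (persons 1, 2 and 4 own a blue, green and red car, person 3 has no car). The variable name nonblue_bad is an assumption made for illustration.

```stata
* Four-person example data, values as implied by the text of this section
clear
input person havecar carcolor
1 1 1
2 1 2
3 0 .
4 1 3
end

* Arithmetic with a system missing value yields a system missing value
generate nonsense = (1-havecar) + 2*carcolor

* Naive flag: missing counts as larger than 1, so person 3 wrongly gets a 1
generate nonblue_bad = carcolor > 1

* Explicit flag: observations with missing carcolor stay missing
generate nonblue = carcolor > 1 if !missing(carcolor)

list
```

The function missing() is used here instead of carcolor!=. because it also catches the extended missing values .a to .z.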

5 More on Syntax

5.1 Functions

Stata has many built-in functions. Using the keyword functions in the help file (help functions) gives an overview of the different types of functions. Clicking on one of the words in blue redirects you to a list of all functions Stata has in this category. For example, selecting math functions gives a list of mathematical functions. Here we can find, for example, the syntax needed to create a new variable var2 that equals e to the power var1, rounded to the nearest integer:

    generate var2 = round( exp(var1) )

5.2 If-statements

You can use an if-statement if you want to generate a variable or perform an analysis only for a subset of the data. For example,

    generate x=1 if employment==1

will create a variable x that equals 1 if employment equals 1 and is missing otherwise. If there is more than one condition to be satisfied, use & between the conditions:

    generate x=1 if employment==1 & sex=="F"

Note that with string variables you always have to put the value between double quotation marks. The relational and logical operators you can use in Stata are:

    ==   equal to
    !=   not equal to (same as ~=)
    ~=   not equal to (same as !=)
    >    greater than
    >=   greater than or equal to
    <    less than
    <=   less than or equal to
    &    and
    |    or

Note that a condition of the type "some variable is equal to ..." requires a double equality sign! In contrast, in an expression like generate x=1 only a single equality sign is needed. The difference is that the symbol = carries different meanings: generate x=1 should be read as "value 1 is being assigned to the new variable x"; generate y=(x==1) should be read as "the result of the assertion x is equal to 1 is being assigned to the new variable y", where "x is equal to 1" can be either true (value 1) or not true (value 0).

5.3 Loops

Sometimes you want to perform almost the same commands many times. For example, you want to generate a separate variable for each category of a categorical variable. Instead of typing the same commands over and over again, you can use a loop. Suppose your categorical variable is called wagecat and has 5 categories, 1 to 5.

    forvalues i=1(1)5 {
        generate x`i' = wagecat==`i'
    }

will generate 5 new variables: x1, x2, x3, x4, and x5. What Stata actually interprets is the following as it goes through the loop in 5 rounds:

    i=1   generate x1=wagecat==1
    i=2   generate x2=wagecat==2
    i=3   generate x3=wagecat==3
    i=4   generate x4=wagecat==4
    i=5   generate x5=wagecat==5

There are a few important remarks to make about this syntax. First, note that the single quotation mark ` before the i is different from the single quotation mark ' after the i. The command will not work if you have these quotation marks incorrect. (i is actually a local macro, on which more below.) Second, the number between round brackets denotes the size of the steps to take when going from 1 to 5. (If we had coded i=0(100)500 we would have stepped through 0, 100, 200, ..., 500.) Third, you can choose any other symbol (or even a word) instead of i. Fourth, the syntax requires that the opening curly brace { is on the same line as forvalues, not followed by anything executable on the same line (line break required, unless only comments follow), and that the closing curly brace } is on a line of its own.

The forvalues command only works for numerical values. To make a loop over a list of text or other objects instead of numbers we use a different command, foreach:

    foreach var in x1 x2 x3 x4 x5 {
        replace `var' = . if `var'==0
    }

The same remarks as before for forvalues apply. While loops and if/else constructs can be programmed as well, see help while and help ifcmd in Stata.

5.4 By and bysort

When you want to perform the same Stata commands on a number of subsets of the data, the by command can be helpful.

    bysort sex sector: summarize hours

will create summary statistics for the variable hours for each combination of the variables sex and sector. To use by, the data must be sorted by the variables that determine the subgroups (here sex and sector). If the data are not yet sorted, you need to specify that by using bysort instead of by. Alternatively, you can sort them first with the command sort sex sector, and subsequently use by. Not all Stata commands can be used in combination with by or bysort. The help-files indicate in each lemma whether the command can be used in combination with bysort.
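The loop and bysort syntax above can be tried out on the auto dataset that ships with Stata. This is only a sketch: the variables rep78, foreign and mpg belong to that example dataset, not to the workshop data, and the names rep1 to rep5 are chosen for illustration.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* One dummy variable per category of rep78 (which takes the values 1 to 5);
* observations with missing rep78 stay missing in the dummies
forvalues i = 1(1)5 {
    generate byte rep`i' = rep78 == `i' if !missing(rep78)
}

* Loop over a list of variable names instead of numbers
foreach var in rep1 rep2 rep3 rep4 rep5 {
    tabulate `var'
}

* Summary statistics of mpg, separately for domestic and foreign cars
bysort foreign: summarize mpg
```

Because every car with a nonmissing rep78 falls in exactly one category, the five dummies add up to 1 for those observations, which is a quick way to check that the loop did what was intended.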

5.5 Recode

When you want to generate a categorical variable from a discrete or continuous variable, the command recode can save a lot of typing and if-statements. Suppose we have a variable called age that contains the age of the observed individuals and we want to create 5 categories: 0-18, 19-25, 26-40, 41-65, 66+.

    recode age (0/18=1) (19/25=2) (26/40=3) (41/65=4) (66/max=5), generate(agecat)

creates a variable called agecat with the appropriate value for each observation. max and min can be used if you do not know the maximum or minimum value the variable takes. The example above works fine if age is a discrete variable. If age is a continuous variable, it is better to use

    recode age (0/18=1) (18/25=2) (25/40=3) (40/65=4) (65/max=5), generate(agecat)

which refers to the categories 0<=age<=18, 18<age<=25, etc. (a value that matches two rules is assigned according to the first one).

5.6 Abbreviating variable names

In section 3, when we discussed the syntax of tabulate, we already saw that commands can be abbreviated (tabulate could be abbreviated by tab, for example). But the amount of typing can be reduced even further, by also abbreviating variable names. These are Stata's rules for abbreviating variable names:

grinc* are all variables that start with grinc, so in our example grinc* stands for: grinc_wage grinc_sempl grinc_sw grinc_pens

*b are all variables that end with b, so in our example *b stands for: yearb monb dayb startjob satisf_job

s*b are all variables that start with s and end with b, with any number of characters in between. So in our example s*b stands for: startjob satisf_job

y~b is a variable that starts with y and ends with b, with any number of characters in between. However, in contrast to using *, ~ refers to a single variable. If more than one variable matches the description, you will get an error message. In our example y~b stands for: yearb

grinc_wage-grinc_pens are all variables that are stored between grinc_wage and grinc_pens in the dataset, so in our example grinc_wage-grinc_pens stands for: grinc_wage mainactivity grinc_sempl grinc_sw grinc_pens

5.7 Macros

Macros are an underappreciated and often misunderstood, yet essential, part of Stata. Basically, they are just bits of text or numbers that can be referred to. But they can also be manipulated and that makes them extremely versatile and useful. If you are new to Stata and you see yourself being confronted with macros, it may take a little time to get used to them. Macros come in two guises: local and global macros.

Let us start with a local macro. We have seen one already in the code:

    forvalues i=1(1)5 {
        generate x`i' = wagecat==`i'
    }

We mentioned that i is a local macro. It is in some sense like a variable since it holds a value, but it is not listed among the variables, and it does not have observations (but just a single value). In the code above, local i takes values 1, 2, ..., 5. The use of single quotation marks around it actually retrieves the current value: `i' results in 1 in the first round, 2 in the second round, etc., and 5 in the fifth round. The first line assigns particular values to the local macro, the second line retrieves the value of the macro (in two places: once in the definition of a variable name, and once in the evaluation of a logical expression).

There are other ways of achieving the same goal, when explicitly using the local macro syntax. We can alternatively code

    local i=1
    while `i'<=5 {
        generate x`i' = wagecat==`i'
        local i=`i'+1
    }

to obtain the same result. Note the last line inside the loop: here, the content of the local macro called i is being overwritten. i is being reassigned the result (value) of the calculation `i'+1, where `i' is the currently known value; after the assignment has been concluded the value has been updated (increased by 1).

Next to local macros there are global macros. They pretty much do the same thing, although they are more frequently encountered as holding strings rather than numerical values. The main distinguishing feature is that their syntax looks a little different, in particular as far as retrieval of values is concerned. So, we could say

    global i=1
    generate x$i=wagecat==$i

That is, we prefix a dollar sign to the global's name if we are to retrieve its value.
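The difference between the two retrieval syntaxes can be seen side by side in the following sketch. It uses the auto dataset that ships with Stata; the macro names i and j and the variable names l_rep3 and g_rep4 are assumptions made for illustration. Run the lines together (for example from a do-file), since a local macro only exists within the run in which it was defined.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* Local macro: defined with the command local, retrieved between
* a left and a right single quotation mark
local i = 3
generate byte l_rep`i' = rep78 == `i' if !missing(rep78)

* Global macro: defined with the command global, retrieved with a
* dollar sign in front of its name
global j = 4
generate byte g_rep$j = rep78 == $j if !missing(rep78)

* Overwrite the local with its own current value plus 1
local i = `i' + 1
display "local i is now `i' and global j is $j"
```

After these lines the dataset contains the new variables l_rep3 and g_rep4, and the local macro i holds the value 4.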
We can rewrite our while loop using globals, but we cannot rewrite a foreach or forvalues loop using globals, because the counter of those loops is always a local macro.

Globals are often used to store and reuse text. Here is an application. Suppose you run a large number of regressions, each with different options or on different samples, but all with the same specification. You can collect your variables in globals and simply refer to those rather than retype all your variable lists. Instead of typing

    regress wage female nkids black south eduhigh edulow if region==1
    regress wage female nkids black south eduhigh edulow if region==2
    regress wage female nkids black south eduhigh edulow if region==3

you could type

    global yvar wage
    global xvars female nkids black south eduhigh edulow
    forvalues i=1(1)3 {
        regress $yvar $xvars if region==`i'
    }

The next time you discover you want to change your specification and replace nkids with nsons ndaughters, you only need to change the line

    global xvars female nsons ndaughters black south eduhigh edulow

The content of macros can be listed and displayed using

    macro list

or

    macro dir

Local macros will show up in the table with a leading underscore, as in _i. In addition, the listing will also show so-called system macros (defined by Stata, not by the user).

5.8 Scalars

A scalar is an element that has one value. For example, typing

    count
    scalar num = r(N)
    display num

saves the output of the count command, i.e. the number of observations, into num (a scalar). Note that the stored result is called r(N), with a capital N. Typing display num then shows you what num is equal to. Note the difference between a scalar and a variable: the latter is a column with one value for every observation, whereas a scalar has one value only. For more information, see help scalar.
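The global variable lists and the scalar syntax above can be combined in one short, runnable sketch. It uses the auto dataset that ships with Stata, so the variables price, mpg, weight and foreign are assumptions that replace the wage example of the text; the scalar name nforeign is chosen for illustration.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* Collect the specification in two global macros
global yvar price
global xvars mpg weight

* Same regression on two subsamples: domestic (0) and foreign (1) cars
forvalues i = 0(1)1 {
    regress $yvar $xvars if foreign == `i'
}

* Store the number of foreign cars in a scalar and show it
count if foreign == 1
scalar nforeign = r(N)
display nforeign

* List all macros that are currently defined
macro list
```

Changing the specification for both regressions now only requires editing the line that defines xvars.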

6 Graphs

For simple graphs (a scatter plot, histogram, regression line, etc. in standard lay-out), using the command line is the easiest way to go. If you want to customize a graph, for example because you want to use it in your thesis, you have two options. First, you can go to Stata's help-file and type graph. This gives an extensive description of all possibilities. But, especially if you just started to work with Stata, there is an easier alternative. With a click on Graphics, a dropdown menu will open (see left part of Figure 2) with a list of different types of graphs. If you click on, for example, Twoway Graphs (scatter, line, etc.), a new window opens (see right part of Figure 2) in which you can simply select the type of graph, the variables to be used and many lay-out parameters, like the color of the line (or dots), the symbol used for a scatter diagram, titles of the axes, range of the axes, etc. After you've clicked OK, Stata will write down the correct syntax and create the graph. By looking at the syntax that Stata prints in the main window, you will learn how the syntax for a customized graph is built up. This graph creator is somewhat slower than using the command line.

Figure 2: creating graphs from the menu

Graphs always open in a separate window. This also implies that they will not appear in your log-file! For example,

    histogram hours, start(0) width(2)

produces a histogram of hours with bar width 2 and starting point for the bars equal to 0.

6.1 Saving a graph

If you want to save a graph that you have created, the command

    graph save histogram

will save the graph under the name histogram.gph. Stata's standard format for graphs is .gph. If you want to save your graph in a different format, you can choose from the available formats .ps, .eps, .wmf, .emf, .png and .tif. The command to save a graph in one of these formats is graph export instead of graph save. The command

    graph export histogram.tif, replace

saves my histogram as a .tif file and replaces any existing graph called histogram.tif in the folder I'm working in.

6.2 (Overlaid) two-way graphs

A graph that shows the relation between two variables is called a two-way graph in Stata. An example of such a graph is a scatter plot:

    graph twoway scatter hours grinc_wage

creates a scatter plot of hours worked per week on the y-axis and gross wage on the x-axis (see left panel of Figure 3).
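The graph commands discussed so far can be tried in sequence with the following sketch. It uses the auto dataset that ships with Stata (the variables mpg and price belong to that dataset) and the file name mpg_hist, which is an assumption made for illustration. Exporting to .png assumes a Stata installation with a graphical interface; not every export format is available on every operating system.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* Histogram with bars of width 2, the first bar starting at 10
histogram mpg, start(10) width(2)

* Save the graph in Stata's own format, overwriting an existing file
graph save mpg_hist, replace

* Export the same graph to a format that other programs can read
graph export mpg_hist.png, replace

* Two-way graph: price on the y-axis, mpg on the x-axis
graph twoway scatter price mpg
```

Both files are written to the current working directory, which you can look up with the command pwd.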

Creating one graph that contains the result of merging two two-way graphs is called an overlaid two-way graph in Stata. For example, you might want to make a graph that contains both a scatter plot of two variables X and Y and the regression line of regressing Y on X. The command

    graph twoway (scatter hours grinc_wage) (lfit hours grinc_wage)

creates the scatter plot we made before, and on top of it the regression line of hours on grinc_wage (see right panel of Figure 3). You can customize both layers as much as you want. All commands for the first layer (the scatter plot) are between the first set of brackets, all commands for the second layer (the fitted regression line) are between the second set of brackets.

Figure 3: two-way graphs

7 Installing user written commands

Stata has a large array of built-in commands, but sometimes it is useful to be able to use user written commands that are not standard in Stata. These have been written by other users (in so-called .ado files) and shared. These programs can be installed easily by typing ssc install <commandname> in Stata's command window (possibly adding the option replace after a comma if the program has been installed previously). There are a lot of user written commands available. To see which user written commands are most widely used by other Stata users, type ssc hot in the command window.

Important: At the university computers it is not permitted to write on the C:\ drive. To be able to install user written commands, you will have to instruct Stata to install files on your personal drive (H:\). This can be done as follows. Before writing ssc install <commandname>, type:

    sysdir set UPDATES "<DIRECTORY>"
    sysdir set PLUS "<DIRECTORY>"

where <DIRECTORY> is the place on your personal drive where Stata will store all installation files, e.g. H:\Documents\Mystatafolder\. After installing the package, the corresponding help file will be available in Stata as well. In the next section you will see a helpful user written program called estout.

8 Econometric analysis

In the end, the reason you are using Stata is that you want to do econometric analysis. There are multiple ways to do econometric analysis with Stata. The first way is by writing a do-file with all the commands for your analysis and running it. Second, you could also use the command line. And third, Stata also offers statistical analysis from a drop-down menu. To use this last option, click on Statistics and choose one of the methods of analysis from the drop-down menu (see Figure 4).

Figure 4: the drop-down menu for Statistics

This will open a separate window where you can specify the exact model. In this tutorial we will focus on the first method, writing a do-file with all necessary commands.

8.1 Correlation coefficient

If you are interested in the correlation coefficient or correlation matrix between two or more variables, you can use the commands correlate or pwcorr. Both commands are similar, but there is for example an extra option available only with correlate that enables you to show the covariance matrix instead of the correlation matrix, namely the option covariance. With the pwcorr command you can also see the p-value that corresponds to the null hypothesis of zero correlation, by specifying the extra option sig, like this:

    pwcorr hours grinc_wage, sig

Note: if this command gives you an error, then the reason is probably that hours is a string variable instead of a numerical one. You will first need to make it into a numerical variable with the command destring hours, replace.

The resulting correlation table shows the correlation between the variables hours and grinc_wage and, below it, the p-value that corresponds to the null hypothesis that the correlation is zero.

Important note 1: If you find that two variables are statistically significantly correlated in your dataset, this doesn't mean that their correlation is nonzero in the whole population. Characteristics of small or unrepresentative samples do not necessarily reflect characteristics of the population.

Important note 2: If two variables are statistically significantly correlated, this does not mean that one causes the other, i.e. correlation does not imply causation! Example: the number of ice-creams eaten per month correlates positively with the number of drownings per month. Does eating ice-cream cause drownings, or the other way around? Or is there another reason (a so-called omitted variable) causing both to increase at the same time?
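The two correlation commands can be compared with a short sketch on the auto dataset that ships with Stata. The variables price, mpg and weight are assumptions that stand in for hours and grinc_wage of the workshop data.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* Correlation matrix of three variables
correlate price mpg weight

* Same command with the option covariance: covariance matrix instead
correlate price mpg weight, covariance

* Pairwise correlations with the p-value printed below each coefficient
pwcorr price mpg weight, sig
```

A further difference worth knowing: correlate drops every observation that has a missing value in any of the listed variables, whereas pwcorr uses all observations available for each pair of variables.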

8.2 T-test of equal means

If you want to test whether the mean of a variable (statistically significantly) differs between two subgroups, you can use the ttest command. For example:

    ttest grinc_wage, by(sector)

tests whether the mean gross weekly wage (variable grinc_wage) differs between the private and public sector (variable sector). In the resulting Stata output, the null hypothesis is that of equal means (H0: diff = 0). The p-values for three alternative hypotheses (Ha stands for alternative hypothesis) are given in the final two rows. The corresponding t-statistic is reported in the output as well.

Note: The standard assumption of ttest is that the variances in the two groups are equal. If you have reason to believe that the variances are unequal, then add unequal at the end of the command (after the comma).

8.3 Linear regression model

The command for OLS regression in Stata is regress.

    regress hours grinc_wage children

runs a linear regression with hours (hours worked per week) as the dependent variable and grinc_wage (gross weekly wage) and children (number of children) as independent variables. This order, the first variable being the dependent variable followed by the independent variables, is common to all estimation methods. Stata automatically includes a constant. If you do not want to have a constant included in the regression, you need to specify the option noconstant.
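Both commands can be tried on the auto dataset that ships with Stata. This is a sketch only; the variables mpg, foreign, price and weight are assumptions that stand in for the workshop variables.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* Test of equal mean mpg for domestic and foreign cars (equal variances)
ttest mpg, by(foreign)

* Same test without assuming equal variances in the two groups
ttest mpg, by(foreign) unequal

* OLS with price as dependent variable; a constant is included automatically
regress price mpg weight

* Same regression without a constant
regress price mpg weight, noconstant
```

The by() variable of ttest must define exactly two groups; with more than two groups Stata returns an error.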

The regression specified above produces its output in the Results Window. Note that the estimation is based on 250 observations, while we have 500 observations in our dataset (see the results from describe in section 2.2). How did we lose half of our observations? The reason is that for 250 observations, at least one of the three variables used in the regression is missing.

Important note (repeated from section 4.6): In regressions, observations that have a missing value for any one of the specified variables are NOT taken into account in the estimation! Adding one variable with many missings to your model may therefore dramatically decrease your sample size and hence affect the results. This is particularly true if your regression model has many variables, each of them having many missing values.

To make nice regression tables which can be exported to Excel or LaTeX, install the estout package (see footnote 3):

    ssc install estout

To see how the above regression results look in estout, type:

    eststo clear
    eststo: regress hours grinc_wage children
    esttab

eststo clear clears all previous regression results. eststo: <regression> stores the regression. esttab displays the stored regression results in a table.

Footnote 3: Package documentation for estout can be found at

This type of representation of the regression results is now ready to be included in a report or (with slight adaptation) in an academic paper. To export the table into Excel or LaTeX type:

    esttab using <filename> [, options]

where the filename extension specifies to which program you would like to export the table, and the options specify how exactly the table should be formatted. See the esttab help file and .pdf documentation for more information on its (many) options. Another popular package for handling regression output is outreg.

8.4 Post-estimation commands

For every estimation method, there are also some post-estimation commands available. These commands use the results of the last estimation in memory (= the last one you performed) to provide additional information on the estimation. Which post-estimation commands are available for a particular estimation method can be found in the Help-file. The lemma of the estimation method always provides a link to a list of available post-estimation commands. For regress, one of the available post-estimation commands is predict.

    predict pred_hours, xb

uses the estimation results to provide a linear prediction, stored under the new variable name that is specified (here pred_hours). The added option xb indicates using a linear prediction. The command predict can not only provide a prediction based on the estimation results, but has many more options. For example,

    predict resid, residuals

creates a new variable called resid that contains the residuals of the last estimation.

8.5 Storing estimation results

The estimation results in memory can be stored with the command estimates store. For example,

    estimates store model1

saves the estimation results of the last estimation under the name model1. With

    estimates dir

you get a list of all the estimation results you stored. The names you gave to the estimation results appear in blue. To see the estimation results of model1 on your screen again, click on the blue word model1 in the Results window. If you want to perform post-estimation commands on one of the estimation results, you first have to load them into Stata's memory (make them "active") with

    estimates restore model1

Estimates has some other useful subcommands:

estimates drop model1 drops the estimation results stored as model1.

estimates clear drops all estimation results that are stored.

estimates query tells you whether the results currently in memory (the active results) have been stored already, and if so under which name they have been stored.

8.6 All estimation results in one table

Why would you want to store estimation results? For example, if you want to compare two different sets of estimations. A nice feature of Stata using stored estimation results is the possibility to create tables containing the results of multiple estimations. The command for this is estimates table.

    estimates table model1 model2

creates a simple table with only the estimated coefficients of model1 and model2.
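The post-estimation and storage commands of sections 8.4 to 8.6 fit together as in the following sketch. It uses the auto dataset that ships with Stata, so price, mpg and weight are assumptions replacing the workshop variables; pred_price is a name chosen for illustration.

```stata
* Load the example dataset that is installed with Stata
sysuse auto, clear

* First model, stored under the name model1
regress price mpg
estimates store model1

* Second model, stored under the name model2
regress price mpg weight
estimates store model2

* predict uses the results in memory, here those of the second regression
predict pred_price, xb
predict resid, residuals

* List the stored results and put their coefficients side by side
estimates dir
estimates table model1 model2
```

To run predict for the first model instead, make it active first with estimates restore model1.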

Moreover, there are many options available to customize the table the way you like it. As is common to all commands, options are placed after the comma. To place stars next to the coefficients to denote their significance level, use the option star( ). The significance levels can be chosen freely, but these are the conventional ones in Economics. To include standard errors in the table add se as an option. Note that Stata does not allow adding both standard errors and significance stars. It is also possible to include other statistics than the estimated coefficients. All scalars stored along with the estimation under e( ) (more on this in section 9.3) can be included. To include one or more of these statistics, add the option stats(scalarlist) where scalarlist is a list of the statistics you want to add to the table. For example,

    estimates table model1 model2, star( ) stats(N)

creates a table with significance stars and the number of observations of each model. Note that the number of observations is stored as e(N), with a capital N.

The estout command has an even easier way of storing and displaying regression results, see the help file (after installing estout manually with the command ssc install estout), or the .pdf documentation.

9 Miscellaneous Topics

9.1 Reading a dataset with a different format

When you have a dataset in a format different from .dta, the best way to proceed depends on the format of your data.


More information

An Introduction to Stata Part I: Data Management

An Introduction to Stata Part I: Data Management An Introduction to Stata Part I: Data Management Kerry L. Papps 1. Overview These two classes aim to give you the necessary skills to get started using Stata for empirical research. The first class will

More information

Week - 01 Lecture - 04 Downloading and installing Python

Week - 01 Lecture - 04 Downloading and installing Python Programming, Data Structures and Algorithms in Python Prof. Madhavan Mukund Department of Computer Science and Engineering Indian Institute of Technology, Madras Week - 01 Lecture - 04 Downloading and

More information

A quick introduction to STATA

A quick introduction to STATA A quick introduction to STATA Data files and other resources for the course book Introduction to Econometrics by Stock and Watson is available on: http://wps.aw.com/aw_stock_ie_3/178/45691/11696965.cw/index.html

More information

Introduction to Stata Toy Program #1 Basic Descriptives

Introduction to Stata Toy Program #1 Basic Descriptives Introduction to Stata 2018-19 Toy Program #1 Basic Descriptives Summary The goal of this toy program is to get you in and out of a Stata session and, along the way, produce some descriptive statistics.

More information

Stata v 12 Illustration. First Session

Stata v 12 Illustration. First Session Launch Stata PC Users Stata v 12 Illustration Mac Users START > ALL PROGRAMS > Stata; or Double click on the Stata icon on your desktop APPLICATIONS > STATA folder > Stata; or Double click on the Stata

More information

Memory Addressing, Binary, and Hexadecimal Review

Memory Addressing, Binary, and Hexadecimal Review C++ By A EXAMPLE Memory Addressing, Binary, and Hexadecimal Review You do not have to understand the concepts in this appendix to become well-versed in C++. You can master C++, however, only if you spend

More information

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2

More information

1 Introduction to Using Excel Spreadsheets

1 Introduction to Using Excel Spreadsheets Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)

More information

Introduction to Stata: An In-class Tutorial

Introduction to Stata: An In-class Tutorial Introduction to Stata: An I. The Basics - Stata is a command-driven statistical software program. In other words, you type in a command, and Stata executes it. You can use the drop-down menus to avoid

More information

Revision of Stata basics in STATA 11:

Revision of Stata basics in STATA 11: Revision of Stata basics in STATA 11: April, 2016 Dr. Selim Raihan Executive Director, SANEM Professor, Department of Economics, University of Dhaka Contents a) Resources b) Stata 11 Interface c) Datasets

More information

Basics of Stata, Statistics 220 Last modified December 10, 1999.

Basics of Stata, Statistics 220 Last modified December 10, 1999. Basics of Stata, Statistics 220 Last modified December 10, 1999. 1 Accessing Stata 1.1 At USITE Using Stata on the USITE PCs: Stata is easily available from the Windows PCs at Harper and Crerar USITE.

More information

Preparing Data for Analysis in Stata

Preparing Data for Analysis in Stata Preparing Data for Analysis in Stata Before you can analyse your data, you need to get your data into an appropriate format, to enable Stata to work for you. To avoid rubbish results, you need to check

More information

Week 1: Introduction to Stata

Week 1: Introduction to Stata Week 1: Introduction to Stata Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline Log

More information

ADePT: Labor. Technical User s Guide. Version 1.0. Automated analysis of the labor market conditions in low- and middle-income countries

ADePT: Labor. Technical User s Guide. Version 1.0. Automated analysis of the labor market conditions in low- and middle-income countries Development Research Group, Development Economics, World Bank ADePT: Labor Version 1.0 Automated analysis of the labor market conditions in low- and middle-income countries Technical User s Guide The ADePT

More information

A quick introduction to STATA:

A quick introduction to STATA: 1 HG Revised September 2011 A quick introduction to STATA: (by E. Bernhardsen, with additions by H. Goldstein) 1. How to access STATA from the pc s at the computer lab and elsewhere at UiO. At the computer

More information

Introduction to SPSS

Introduction to SPSS Introduction to SPSS Purpose The purpose of this assignment is to introduce you to SPSS, the most commonly used statistical package in the social sciences. You will create a new data file and calculate

More information

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice.

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice. I Launching and Exiting Stata 1. Launching Stata Stata can be launched in either of two ways: 1) in the stata program, click on the stata application; or 2) double click on the short cut that you have

More information

Stata version 13. First Session. January I- Launching and Exiting Stata Launching Stata Exiting Stata..

Stata version 13. First Session. January I- Launching and Exiting Stata Launching Stata Exiting Stata.. Stata version 13 January 2015 I- Launching and Exiting Stata... 1. Launching Stata... 2. Exiting Stata.. II - Toolbar, Menu bar and Windows.. 1. Toolbar Key.. 2. Menu bar Key..... 3. Windows..... III -...

More information

Dr. Barbara Morgan Quantitative Methods

Dr. Barbara Morgan Quantitative Methods Dr. Barbara Morgan Quantitative Methods 195.650 Basic Stata This is a brief guide to using the most basic operations in Stata. Stata also has an on-line tutorial. At the initial prompt type tutorial. In

More information

INTRODUCTION to. Program in Statistics and Methodology (PRISM) Daniel Blake & Benjamin Jones January 15, 2010

INTRODUCTION to. Program in Statistics and Methodology (PRISM) Daniel Blake & Benjamin Jones January 15, 2010 INTRODUCTION to Program in Statistics and Methodology (PRISM) Daniel Blake & Benjamin Jones January 15, 2010 While we are waiting Everyone who wishes to work along with the presentation should log onto

More information

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA Learning objectives: Getting data ready for analysis: 1) Learn several methods of exploring the

More information

Barchard Introduction to SPSS Marks

Barchard Introduction to SPSS Marks Barchard Introduction to SPSS 22.0 3 Marks Purpose The purpose of this assignment is to introduce you to SPSS, the most commonly used statistical package in the social sciences. You will create a new data

More information

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013 DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013 GETTING STARTED PAGE 02 Prerequisites What You Will Learn MORE TASKS IN MICROSOFT EXCEL PAGE 03 Cutting, Copying, and Pasting Data Basic Formulas Filling Data

More information

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS.

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS. 1 SPSS 11.5 for Windows Introductory Assignment Material covered: Opening an existing SPSS data file, creating new data files, generating frequency distributions and descriptive statistics, obtaining printouts

More information

Getting started with Stata 2017: Cheat-sheet

Getting started with Stata 2017: Cheat-sheet Getting started with Stata 2017: Cheat-sheet 4. september 2017 1 Get started Graphical user interface (GUI). Clickable. Simple. Commands. Allows for use of do-le. Easy to keep track. Command window: Write

More information

Empirical trade analysis

Empirical trade analysis Empirical trade analysis Introduction to Stata Cosimo Beverelli World Trade Organization Cosimo Beverelli Stata introduction Bangkok, 18-21 Dec 2017 1 / 23 Outline 1 Resources 2 How Stata looks like 3

More information

Advanced Regression Analysis Autumn Stata 6.0 For Dummies

Advanced Regression Analysis Autumn Stata 6.0 For Dummies Advanced Regression Analysis Autumn 2000 Stata 6.0 For Dummies Stata 6.0 is the statistical software package we ll be using for much of this course. Stata has a number of advantages over other currently

More information

EC121 Mathematical Techniques A Revision Notes

EC121 Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes Mathematical Techniques A begins with two weeks of intensive revision of basic arithmetic and algebra, to the level

More information

Let s use Technology Use Data from Cycle 14 of the General Social Survey with Fathom for a data analysis project

Let s use Technology Use Data from Cycle 14 of the General Social Survey with Fathom for a data analysis project Let s use Technology Use Data from Cycle 14 of the General Social Survey with Fathom for a data analysis project Data Content: Example: Who chats on-line most frequently? This Technology Use dataset in

More information

Chapter 2 The SAS Environment

Chapter 2 The SAS Environment Chapter 2 The SAS Environment Abstract In this chapter, we begin to become familiar with the basic SAS working environment. We introduce the basic 3-screen layout, how to navigate the SAS Explorer window,

More information

Programming Fundamentals and Python

Programming Fundamentals and Python Chapter 2 Programming Fundamentals and Python This chapter provides a non-technical overview of Python and will cover the basic programming knowledge needed for the rest of the chapters in Part 1. It contains

More information

An Introduction to Stata By Mike Anderson

An Introduction to Stata By Mike Anderson An Introduction to Stata By Mike Anderson Installation and Start Up A 50-user licensed copy of Intercooled Stata 8.0 for Solaris is accessible on any Athena workstation. To use it, simply type add stata

More information

Civil Engineering Computation

Civil Engineering Computation Civil Engineering Computation First Steps in VBA Homework Evaluation 2 1 Homework Evaluation 3 Based on this rubric, you may resubmit Homework 1 and Homework 2 (along with today s homework) by next Monday

More information

ECO375 Tutorial 1 Introduction to Stata

ECO375 Tutorial 1 Introduction to Stata ECO375 Tutorial 1 Introduction to Stata Matt Tudball University of Toronto Mississauga September 14, 2017 Matt Tudball (University of Toronto) ECO375H5 September 14, 2017 1 / 25 What Is Stata? Stata is

More information

The name of our class will be Yo. Type that in where it says Class Name. Don t hit the OK button yet.

The name of our class will be Yo. Type that in where it says Class Name. Don t hit the OK button yet. Mr G s Java Jive #2: Yo! Our First Program With this handout you ll write your first program, which we ll call Yo. Programs, Classes, and Objects, Oh My! People regularly refer to Java as a language that

More information

STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany

STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany STATA Tutorial to Accompany Introduction to Econometrics by James H. Stock and Mark W. Watson STATA Tutorial to accompany Stock/Watson Introduction to Econometrics Copyright 2003 Pearson Education Inc.

More information

UNIT 4. Research Methods in Business

UNIT 4. Research Methods in Business UNIT 4 Preparing Data for Analysis:- After data are obtained through questionnaires, interviews, observation or through secondary sources, they need to be edited. The blank responses, if any have to be

More information

STATA 13 INTRODUCTION

STATA 13 INTRODUCTION STATA 13 INTRODUCTION Catherine McGowan & Elaine Williamson LONDON SCHOOL OF HYGIENE & TROPICAL MEDICINE DECEMBER 2013 0 CONTENTS INTRODUCTION... 1 Versions of STATA... 1 OPENING STATA... 1 THE STATA

More information

Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction

Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction Biology 345: Biometry Fall 2005 SONOMA STATE UNIVERSITY Lab Exercise 2 Working with data in Excel and exporting to JMP Introduction In this exercise, we will learn how to reorganize and reformat a data

More information

If Statements, For Loops, Functions

If Statements, For Loops, Functions Fundamentals of Programming If Statements, For Loops, Functions Table of Contents Hello World Types of Variables Integers and Floats String Boolean Relational Operators Lists Conditionals If and Else Statements

More information

An Introduction to Stata

An Introduction to Stata An Introduction to Stata Instructions Statistics 111 - Probability and Statistical Inference Jul 3, 2013 Lab Objective To become familiar with the software package Stata. Lab Procedures Stata gives us

More information

A Short Introduction to STATA

A Short Introduction to STATA A Short Introduction to STATA 1) Introduction: This session serves to link everyone from theoretical equations to tangible results under the amazing promise of Stata! Stata is a statistical package that

More information

To complete the computer assignments, you ll use the EViews software installed on the lab PCs in WMC 2502 and WMC 2506.

To complete the computer assignments, you ll use the EViews software installed on the lab PCs in WMC 2502 and WMC 2506. An Introduction to EViews The purpose of the computer assignments in BUEC 333 is to give you some experience using econometric software to analyse real-world data. Along the way, you ll become acquainted

More information

Intermediate Excel 2003

Intermediate Excel 2003 Intermediate Excel 2003 Introduction The aim of this document is to introduce some techniques for manipulating data within Excel, including sorting, filtering and how to customise the charts you create.

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

Chapter 2 Assignment (due Thursday, April 19)

Chapter 2 Assignment (due Thursday, April 19) (due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should

More information

Microsoft Excel Level 2

Microsoft Excel Level 2 Microsoft Excel Level 2 Table of Contents Chapter 1 Working with Excel Templates... 5 What is a Template?... 5 I. Opening a Template... 5 II. Using a Template... 5 III. Creating a Template... 6 Chapter

More information

The Very Basics of the R Interpreter

The Very Basics of the R Interpreter Chapter 2 The Very Basics of the R Interpreter OK, the computer is fired up. We have R installed. It is time to get started. 1. Start R by double-clicking on the R desktop icon. 2. Alternatively, open

More information

printf( Please enter another number: ); scanf( %d, &num2);

printf( Please enter another number: ); scanf( %d, &num2); CIT 593 Intro to Computer Systems Lecture #13 (11/1/12) Now that we've looked at how an assembly language program runs on a computer, we're ready to move up a level and start working with more powerful

More information

Full file at

Full file at Java Programming: From Problem Analysis to Program Design, 3 rd Edition 2-1 Chapter 2 Basic Elements of Java At a Glance Instructor s Manual Table of Contents Overview Objectives s Quick Quizzes Class

More information

T H E I N T E R A C T I V E S H E L L

T H E I N T E R A C T I V E S H E L L 3 T H E I N T E R A C T I V E S H E L L The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. Ada Lovelace, October 1842 Before

More information

Using Microsoft Excel

Using Microsoft Excel Using Microsoft Excel Introduction This handout briefly outlines most of the basic uses and functions of Excel that we will be using in this course. Although Excel may be used for performing statistical

More information

Lecture 05 I/O statements Printf, Scanf Simple statements, Compound statements

Lecture 05 I/O statements Printf, Scanf Simple statements, Compound statements Programming, Data Structures and Algorithms Prof. Shankar Balachandran Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture 05 I/O statements Printf, Scanf Simple

More information

LAB #1: DESCRIPTIVE STATISTICS WITH R

LAB #1: DESCRIPTIVE STATISTICS WITH R NAVAL POSTGRADUATE SCHOOL LAB #1: DESCRIPTIVE STATISTICS WITH R Statistics (OA3102) Lab #1: Descriptive Statistics with R Goal: Introduce students to various R commands for descriptive statistics. Lab

More information

Intro To Excel Spreadsheet for use in Introductory Sciences

Intro To Excel Spreadsheet for use in Introductory Sciences INTRO TO EXCEL SPREADSHEET (World Population) Objectives: Become familiar with the Excel spreadsheet environment. (Parts 1-5) Learn to create and save a worksheet. (Part 1) Perform simple calculations,

More information

EXCEL BASICS: MICROSOFT OFFICE 2007

EXCEL BASICS: MICROSOFT OFFICE 2007 EXCEL BASICS: MICROSOFT OFFICE 2007 GETTING STARTED PAGE 02 Prerequisites What You Will Learn USING MICROSOFT EXCEL PAGE 03 Opening Microsoft Excel Microsoft Excel Features Keyboard Review Pointer Shapes

More information

Create your first workbook

Create your first workbook Create your first workbook You've been asked to enter data in Excel, but you've never worked with Excel. Where do you begin? Or perhaps you have worked in Excel a time or two, but you still wonder how

More information

Handling Your Data in SPSS. Columns, and Labels, and Values... Oh My! The Structure of SPSS. You should think about SPSS as having three major parts.

Handling Your Data in SPSS. Columns, and Labels, and Values... Oh My! The Structure of SPSS. You should think about SPSS as having three major parts. Handling Your Data in SPSS Columns, and Labels, and Values... Oh My! You might think that simple intuition will guide you to a useful organization of your data. If you follow that path, you might find

More information

Bits, Words, and Integers

Bits, Words, and Integers Computer Science 52 Bits, Words, and Integers Spring Semester, 2017 In this document, we look at how bits are organized into meaningful data. In particular, we will see the details of how integers are

More information

1. Introduction to Microsoft Excel

1. Introduction to Microsoft Excel 1. Introduction to Microsoft Excel A spreadsheet is an online version of an accountant's worksheet, which can automatically do most of the calculating for you. You can do budgets, analyze data, or generate

More information

Barchard Introduction to SPSS Marks

Barchard Introduction to SPSS Marks Barchard Introduction to SPSS 21.0 3 Marks Purpose The purpose of this assignment is to introduce you to SPSS, the most commonly used statistical package in the social sciences. You will create a new data

More information

CSCI 1100L: Topics in Computing Lab Lab 11: Programming with Scratch

CSCI 1100L: Topics in Computing Lab Lab 11: Programming with Scratch CSCI 1100L: Topics in Computing Lab Lab 11: Programming with Scratch Purpose: We will take a look at programming this week using a language called Scratch. Scratch is a programming language that was developed

More information

Subject index. ASCII data, reading comma-separated fixed column multiple lines per observation

Subject index. ASCII data, reading comma-separated fixed column multiple lines per observation Subject index Symbols %fmt... 106 110 * abbreviation character... 374 377 * comment indicator...346 + combining strings... 124 125 - abbreviation character... 374 377.,.a,.b,...,.z missing values.. 130

More information

Spreadsheet Functions

Spreadsheet Functions Class Description This is an introduction to the use of functions in spreadsheets, with a focus on Microsoft Excel and Google Drive Spreadsheets. The main topics are arithmetic calculations and order of

More information

MATLAB Project: Getting Started with MATLAB

MATLAB Project: Getting Started with MATLAB Name Purpose: To learn to create matrices and use various MATLAB commands for reference later MATLAB built-in functions used: [ ] : ; + - * ^, size, help, format, eye, zeros, ones, diag, rand, round, cos,

More information

This chapter is intended to take you through the basic steps of using the Visual Basic

This chapter is intended to take you through the basic steps of using the Visual Basic CHAPTER 1 The Basics This chapter is intended to take you through the basic steps of using the Visual Basic Editor window and writing a simple piece of VBA code. It will show you how to use the Visual

More information

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller Table of Contents Introduction!... 1 Part 1: Entering Data!... 2 1.a: Typing!... 2 1.b: Editing

More information

EXCEL BASICS: MICROSOFT OFFICE 2010

EXCEL BASICS: MICROSOFT OFFICE 2010 EXCEL BASICS: MICROSOFT OFFICE 2010 GETTING STARTED PAGE 02 Prerequisites What You Will Learn USING MICROSOFT EXCEL PAGE 03 Opening Microsoft Excel Microsoft Excel Features Keyboard Review Pointer Shapes

More information

Excerpt from "Art of Problem Solving Volume 1: the Basics" 2014 AoPS Inc.

Excerpt from Art of Problem Solving Volume 1: the Basics 2014 AoPS Inc. Chapter 5 Using the Integers In spite of their being a rather restricted class of numbers, the integers have a lot of interesting properties and uses. Math which involves the properties of integers is

More information

Introduction to Stata

Introduction to Stata Introduction to Stata Introduction In introductory biostatistics courses, you will use the Stata software to apply statistical concepts and practice analyses. Most of the commands you will need are available

More information

Spectroscopic Analysis: Peak Detector

Spectroscopic Analysis: Peak Detector Electronics and Instrumentation Laboratory Sacramento State Physics Department Spectroscopic Analysis: Peak Detector Purpose: The purpose of this experiment is a common sort of experiment in spectroscopy.

More information

Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set

Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set Goal in video # 25: Learn about how to use the Get & Transform

More information

Excel Primer CH141 Fall, 2017

Excel Primer CH141 Fall, 2017 Excel Primer CH141 Fall, 2017 To Start Excel : Click on the Excel icon found in the lower menu dock. Once Excel Workbook Gallery opens double click on Excel Workbook. A blank workbook page should appear

More information

Burning CDs in Windows XP

Burning CDs in Windows XP B 770 / 1 Make CD Burning a Breeze with Windows XP's Built-in Tools If your PC is equipped with a rewritable CD drive you ve almost certainly got some specialised software for copying files to CDs. If

More information

Learning Worksheet Fundamentals

Learning Worksheet Fundamentals 1.1 LESSON 1 Learning Worksheet Fundamentals After completing this lesson, you will be able to: Create a workbook. Create a workbook from a template. Understand Microsoft Excel window elements. Select

More information

EXCEL 2010 BASICS JOUR 772 & 472 / Ira Chinoy

EXCEL 2010 BASICS JOUR 772 & 472 / Ira Chinoy EXCEL 2010 BASICS JOUR 772 & 472 / Ira Chinoy Virus check and backups: Remember that if you are receiving a file from an external source a government agency or some other source, for example you will want

More information

WORKSHOP: Using the Health Survey for England, 2014

WORKSHOP: Using the Health Survey for England, 2014 WORKSHOP: Using the Health Survey for England, 2014 There are three sections to this workshop, each with a separate worksheet. The worksheets are designed to be accessible to those who have no prior experience

More information

Microsoft Excel 2010 Basics

Microsoft Excel 2010 Basics Microsoft Excel 2010 Basics Starting Word 2010 with XP: Click the Start Button, All Programs, Microsoft Office, Microsoft Excel 2010 Starting Word 2010 with 07: Click the Microsoft Office Button with the

More information

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data Introduction About this Document This manual was written by members of the Statistical Consulting Program as an introduction to SPSS 12.0. It is designed to assist new users in familiarizing themselves

More information

tabulate varname [aw=weightvar]

tabulate varname [aw=weightvar] 1 Commands Introduced In this chapter you will learn these Stata basics: How to obtain information about a dataset How to obtain information about variables How to write and save a Do-file (a file that

More information

Econ Stata Tutorial I: Reading, Organizing and Describing Data. Sanjaya DeSilva

Econ Stata Tutorial I: Reading, Organizing and Describing Data. Sanjaya DeSilva Econ 329 - Stata Tutorial I: Reading, Organizing and Describing Data Sanjaya DeSilva September 8, 2008 1 Basics When you open Stata, you will see four windows. 1. The Results window list all the commands

More information

An Introduction to STATA ECON 330 Econometrics Prof. Lemke

An Introduction to STATA ECON 330 Econometrics Prof. Lemke An Introduction to STATA ECON 330 Econometrics Prof. Lemke 1. GETTING STARTED A requirement of this class is that you become very comfortable with STATA, a leading statistical software package. You were

More information

A Short Guide to Stata 10 for Windows

A Short Guide to Stata 10 for Windows A Short Guide to Stata 10 for Windows 1. Introduction 2 2. The Stata Environment 2 3. Where to get help 2 4. Opening and Saving Data 3 5. Importing Data 4 6. Data Manipulation 5 7. Descriptive Statistics

More information

2. Getting started with MLwiN

2. Getting started with MLwiN 2. Getting started with MLwiN Introduction This chapter aims to provide you with some practice with MLwiN commands before you begin to fit multilevel models. It is may be helpful if you have already read

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Introduction

More information