Jennie Murack
You will learn: the structure of the Stata interface; how to open files in Stata; how to modify variable and value labels; how to manipulate variables; how to conduct basic descriptive statistics; and resources to use for future help.
Stata can be accessed on any Athena machine. Some departments may have licenses for Windows machines. By default, Stata looks for data in your Home directory. To change to another folder in that directory, you only need to type the folder name, not the full pathname. Example: cd data vs. cd c://murack/home/data. The Stata toolbar menu displays at the top of the Athena window, not at the top of the Stata window. Sometimes the Data drop-down list does not appear; close Stata and open it again to fix this.
The main Stata windows: Results, Command, Review, Variables, and Do-file (must be opened separately).
Do-files let you save commands for future use or sharing. Commands can be run from the do-file. Open a do-file using the drop-down menu or the toolbar icon. You can open more than one at once. Make sure you save all your final, correct commands in the do-file for future reference.
Stata can copy everything that is sent to the Results window into a log file. Start a log file using File > Log > Begin. The .smcl file type will only open in Stata, so select .log, which is a simple text file. The log records commands and output, and the command precedes each result. It does not include graphic output (such as histograms).
Use File > Log > View to see the .log file. You can hit the refresh button if you run more commands. Use File > Log > Suspend/Resume to pause logging while you are testing out a command.
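The same logging steps can also be typed as commands. The sketch below is only an illustration: the log name mysession.log is hypothetical, and sysuse auto loads one of Stata's built-in example datasets, which is not part of these slides.

```stata
* Minimal sketch: record a session in a plain-text log
capture log close                      // close any log that is already open
log using mysession.log, replace text  // hypothetical log name; text gives a .log file
sysuse auto, clear                     // built-in example data (an assumption)
summarize price
log off                                // suspend logging while testing a command
describe
log on                                 // resume logging
log close
```

The text option plays the same role as choosing the .log file type in the menu.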
All commands are lowercase. Press Enter to execute a command. Use the search or help commands to learn more. Procedures can be conducted using the menus and dialog windows or by entering a command. Use the dialog windows if you are unsure of a command and then copy the command for future use.
Command syntax: command variable(s), options. Example: summarize sex marriage, detail. If you don't specify a variable, Stata will often perform the command on the entire dataset. You can find command-specific syntax in the help files (help commandname).
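As a rough illustration of the command, variable(s), options pattern, the lines below use Stata's built-in auto dataset (an assumption; the slides use their own data with variables such as sex and marriage).

```stata
sysuse auto, clear            // built-in example data, not from the slides
summarize price mpg, detail   // command, two variables, one option
summarize                     // no variables listed: every variable is summarized
* help summarize              // uncomment to open the help file for the command
```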
Use the Page Up button to recall the last command you wrote in the command window. A limited amount of data will be displayed in the results window. You can press any key in the command window to make the data scroll down.
Set your working directory (File > Change working directory, or Command: cd "directory path"). The working directory is where Stata will look for a file or save a file unless you specify the full path to a different directory.
To retrieve your data file: File > Open, or Command: use datasetname.dta. To save your data file: File > Save, or Command: save newdatasetname.dta. Follow the save command with , replace if you are writing over an existing file. Example: save censusdata.dta, replace
Datasets in a variety of formats can be imported. Add , clear to the end of a command when importing or opening data.
Delimited text files: File > Import > Text data (delimited, *.csv, ...). Command: import delimited filename.csv, clear. SAS transport files: File > Import > SAS XPORT. Command: import sasxport filename.xpt (this command is named import sasxport5 in Stata 16 and later).
SPSS/PASW will allow you to save your data as a Stata file. Go to File > Save as > Stata (use the most recent version available). Then you can open it in Stata. You can also use the program Stat/Transfer.
Excel files: use File > Import > Excel spreadsheet, or copy and paste your Excel data directly into the Data Editor. Command: import excel filename.xlsx, sheet("sheetname") firstrow clear. After you paste, you will see a prompt; select Treat first row as variable names.
Command for exporting to .csv: export delimited using newfilename.csv, replace
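A short sketch of the open and save cycle follows. The file name mycopy.dta is hypothetical, and sysuse auto stands in for a use command so that the example runs without any files from the workshop.

```stata
pwd                        // show the current working directory
* cd data                  // uncomment to move into a subfolder named data
sysuse auto, clear         // stands in for: use datasetname.dta, clear
save mycopy.dta, replace   // hypothetical name; replace overwrites an existing file
use mycopy.dta, clear      // reopen the saved file from the working directory
```

Because no path is given, mycopy.dta is written to and read from the working directory.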
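To see importing and exporting together, the sketch below writes a .csv file and reads it back. The file name autocopy.csv is hypothetical and the data come from Stata's built-in auto dataset rather than from the slides.

```stata
sysuse auto, clear                             // built-in example data (an assumption)
export delimited using autocopy.csv, replace   // write the data to a .csv file
import delimited autocopy.csv, clear           // read it back; clear drops the data in memory
describe                                       // check the imported variables
```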
Describe what the file does at the top. Then add commands for the following: opening a log file; opening data (specify the entire directory path if the file is not in the working directory); and saving data under a new name (if making changes to the dataset).
/* DESCRIPTION OF FILE */
cd ~/StataIntro
* specify logname.log if you are not using .smcl
log using logname
use datasetname.dta, clear
save newdata.dta
An asterisk (*) at the beginning of a line marks a text comment; you can also use /* and */ before and after a comment. Add , clear when opening a dataset to clear old data. Pressing Enter (a line break) marks the end of a command. /// at the end of a line tells Stata that the next line is a continuation of the previous one. Use the copy button in dialog boxes to get commands. Remember to save the do-file. You can highlight, copy, and paste into Word, but you may need to do some formatting.
Keep all your files and your do-file in the same directory. They will be easy to locate and you only need to change the Stata working directory once. Give files names that make sense to you. You might want to include dates, locations, etc. Copy all commands into a do-file. Write text comments in your do-file that will help you or a colleague understand the data in the future. These might include things such as where the data came from, why you ran a certain procedure, etc.
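The comment and continuation markers might look like this inside a do-file. This is a sketch only; it uses Stata's built-in auto dataset, which is an assumption and not the workshop data.

```stata
/* Example do-file: comments and line continuation.
   Dataset and variable names come from Stata's built-in auto data. */

* A line that starts with an asterisk is a comment
sysuse auto, clear

* Each command normally ends at the line break;
* three slashes continue it on the next line
summarize price mpg ///
    weight, detail
```

Note that /// works in do-files but not in the Command window, so run these lines from the Do-file Editor.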
Ways to look at your data: the Data Editor, or commands such as list, describe, and codebook.
To open the Data Editor: Data > Data Editor > Data Editor (Edit). Command: edit. To look at data but not change it: Data > Data Editor > Data Editor (Browse). Command: browse. You can also use the toolbar icons.
Data > Describe data > List data. Command: list variable. Lists all responses for a specified variable. For long data files, you should specify a subset of observations to list. Example: list sex in 1/50 (the first 50 observations).
Data > Describe data > Describe data in memory or in a file. Command: describe. It lists the variable names, value labels, and variable labels.
Data > Describe data > Describe data contents (codebook). Command: codebook variable1. All variables will be displayed if you do not specify any. It examines the variable names, labels, and data (range, frequency, missing values).
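A brief sketch of these three commands, assuming Stata's built-in auto dataset (make, price, mpg, and rep78 are variables in that dataset, not in the workshop data):

```stata
sysuse auto, clear
list make price in 1/5       // first five observations only
list make mpg if mpg > 30    // only observations that meet a condition
describe                     // variable names and labels for the whole dataset
codebook rep78               // range, frequencies, and missing values for one variable
```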
Variable labels are descriptive names for each variable. Value labels are the labels that you assign to the values of each variable. Example: 1 = Poor, 2 = Fair, 3 = Good, etc. You can access these using the Variables Manager.
Variable labels: type in a variable label in the Variables Manager or use these commands. To rename a variable: rename variablename newvariablename. Example: rename v1 marital. To label a variable: label variable variablename "new variable label". Example: label variable var2 "income"
Value labels: click Manage next to Value labels and the Manage Value Labels box appears. To create a new label, click Create label, then enter a label name, a value, and a label for that value. Command: label define sex 0 "Female" 1 "Male"
Your new value label will appear in the Manage Value Labels window. After you close the window, you can choose the value label from the drop-down box in the Variables Manager. Command: label values var1 sex
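The renaming and labeling commands can be combined as in the sketch below. It assumes Stata's built-in auto dataset; the names repair, highmpg, and yesno are made up for illustration.

```stata
sysuse auto, clear
rename rep78 repair                         // change the variable name
label variable repair "Repair record 1978"  // set the variable label
generate byte highmpg = mpg > 25            // 0/1 indicator to attach value labels to
label define yesno 0 "No" 1 "Yes"           // create a value label named yesno
label values highmpg yesno                  // apply the value label to the variable
tabulate highmpg                            // the table now shows No and Yes
```

Defining a value label and attaching it are separate steps, so one label such as yesno can be reused for several variables.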
Coding missing values: Stata allows up to 27 missing values for each variable: the system missing value (.) and the letter codes .a through .z. Data > Create or change data > Other variable-transformation commands > Change numeric values to missing. Command: mvdecode _all, mv(-5=.a \ -4=.b)
Duplicating variables. clonevar keeps missing values coded the same way as in the old variable and keeps the same value labels. Data > Create or change data > Other variable-creation commands > Clone existing variable. Command: clonevar newname = oldname. generate does not transfer value labels, but does transfer codes for missing values. You can use basic arithmetic with this command. Data > Create or change data > Create new variable. Command: generate newvar = exp [if] [in]. Use generate newvariable = . to create a new, blank column.
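As a small illustration of recoding numeric codes to missing values, the sketch below types in four made-up observations (the id and income values are invented) in which -5 and -4 stand for two kinds of missing response.

```stata
clear
input id income
1 52000
2 -5
3 -4
4 61000
end
mvdecode _all, mv(-5=.a \ -4=.b)   // -5 becomes .a and -4 becomes .b in every variable
list
count if missing(income)           // counts both .a and .b as missing
```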
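The difference between clonevar and generate can be sketched as follows, assuming Stata's built-in auto dataset; the new variable names foreign2, price_k, and heavy are made up.

```stata
sysuse auto, clear
clonevar foreign2 = foreign        // copy that keeps the value labels
generate price_k = price / 1000    // basic arithmetic on an existing variable
generate heavy = .                 // new, blank (all-missing) column
describe foreign foreign2 price_k heavy
```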
recode changes the values of numeric variables according to the rules specified. Data > Create or change data > Other variable-transformation commands > Recode categorical variable. Command: recode varlist (rule) [(rule) ...] [, generate(newvar)]
In the dialog, choose variables and enter the recode rules in parentheses, for example (0=4) (1=3). Enter new variable names in the Options tab, separated by a space. You can add r to the new names to remember that they are reverse coded.
replace is used to replace the contents of an existing variable. It is useful for creating new categories from numeric variables or performing calculations to create new variables. Make sure to duplicate your data to a new column first or create a new, blank column, so you don't overwrite data. Data > Create or change data > Change contents of variable. Command: replace highincome = 1 if inc>15 & inc<26
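A reverse-coding example might look like the sketch below. It assumes Stata's built-in auto dataset, where rep78 runs from 1 to 5; the new name rep78r follows the r suffix suggestion.

```stata
sysuse auto, clear
* Reverse-code the 1-5 repair record into a new variable;
* values not named in a rule (here 3 and missing) stay as they are
recode rep78 (1=5) (2=4) (4=2) (5=1), generate(rep78r)
tabulate rep78 rep78r, missing     // cross-check old against new values
```

Using generate() leaves the original variable untouched, which makes the recode easy to verify.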
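The pattern of creating a new column first and then filling it with replace can be sketched as below, assuming Stata's built-in auto dataset; highprice and the cutoffs 6000 and 16000 are made up for illustration.

```stata
sysuse auto, clear
generate highprice = 0 if !missing(price)              // start from a new column
replace highprice = 1 if price > 6000 & price < 16000  // fill in the category
tabulate highprice
```

Stata treats missing numeric values as larger than any number, so giving an upper bound as well as a lower bound keeps missing values out of the new category.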
In the Variables Manager, select the variable(s) to drop or keep and right-click. Command: drop variable or keep variable. Use drop if or keep if to select certain records: Data > Create or change data > Drop or keep observations. Command: drop if exp or keep if exp. Examples: drop if age < 12; drop if age > 17 & gender == 1; keep in 1/500.
Operators: == is equal to; != or ~= is not equal to; > is greater than; >= is greater than or equal to; < is less than; <= is less than or equal to.
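The drop and keep variants can be tried safely on Stata's built-in auto dataset (an assumption; the workshop data are not used here), since the changes only affect the copy in memory.

```stata
sysuse auto, clear
drop if mpg < 15                        // drop records that meet a condition
drop if price > 10000 & foreign == 1    // two conditions joined with &
keep in 1/20                            // keep only the first 20 remaining records
drop headroom trunk                     // drop whole variables
count                                   // number of records left
```

Nothing changes on disk until you save, so save under a new name if you want to keep the original file intact.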
Commands for describing data: describe, list, codebook, summarize, tabulate.
Statistics > Summaries, tables and tests > Summary and descriptive statistics > Summary statistics. Command: summarize variable1 variable2. Add , detail at the end of the command to see more summary statistics, including the median (displayed as the 50th percentile). If you do not specify variables, it will summarize all of them. The output includes the number of observations, mean, standard deviation, minimum, and maximum.
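For instance, with Stata's built-in auto dataset (an assumption, not the workshop data):

```stata
sysuse auto, clear
summarize price mpg        // observations, mean, std. dev., min, max
summarize price, detail    // adds percentiles, including the median
display r(p50)             // the median is stored as r(p50) after detail
```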
tabulate with one variable creates a basic frequency table for that variable. It is most useful for categorical variables. Statistics > Summaries, tables and tests > Frequency tables > One-way table. Command: tabulate variable1
tab1 produces a one-way table for each variable listed. Statistics > Summaries, tables and tests > Frequency tables > Multiple one-way tables. Command: tab1 variable1 variable2
tabulate followed by two variables produces a two-way table of frequency counts for the combinations of values of those variables. Statistics > Summaries, tables and tests > Frequency tables > Two-way table with measures of association. Command: tabulate variable1 variable2
tab2 takes a list of variable names and issues the tabulate command separately for each possible pair of variable names. Statistics > Summaries, tables and tests > Frequency tables > All possible two-way tables. Command: tab2 variable1 variable2 variable3
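The four tabulate variants side by side, sketched with Stata's built-in auto dataset (rep78, foreign, and headroom are variables in that dataset, used here only for illustration):

```stata
sysuse auto, clear
tabulate rep78                  // one-way frequency table
tab1 rep78 foreign              // a one-way table for each variable
tabulate rep78 foreign          // two-way table of counts
tab2 rep78 foreign headroom     // a two-way table for every pair of variables
```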
tabstat allows you to make customized tables with more statistics. Statistics > Summaries, tables and tests > Other tables > Compact table of summary statistics.
Use the Graphics menu to make various charts.
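The slide gives only the menu path for tabstat; a command version might look like the sketch below, which assumes Stata's built-in auto dataset and an arbitrary choice of statistics.

```stata
sysuse auto, clear
tabstat price mpg weight, statistics(n mean median sd min max)  // one row per statistic
tabstat price mpg, by(foreign) statistics(mean sd)              // same idea, split by group
```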
Graphics > Histogram. Command: histogram variable, options. Choose your variable of interest and select whether it is discrete or continuous. Select Bar properties to edit the color, outline, bar width, and bar gap.
Use the By tab to select a grouping variable. To add a normal curve, check Add normal-density plot under the Density plots tab. Use the Y axis and X axis tabs to add titles and to alter the scale and value labels.
Command: histogram marital, discrete xtitle(Marital Status) xlabel(, valuelabel). Stata responds with a note such as (start=1, width=1).
Right-click on the graph to start the Graph Editor. Use Plot region to change individual colors, explode, etc. Click the T on the left side to add text. You can start the Graph Recorder and name the recording; run it with different data and the same changes will be made to future graphs.
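Two further histogram sketches, assuming Stata's built-in auto dataset in place of the marital variable from the slides:

```stata
sysuse auto, clear
* Discrete variable, with value labels shown on the x axis
histogram foreign, discrete frequency xtitle("Car origin") xlabel(0 1, valuelabel)
* Continuous variable with a normal curve, one panel per group
histogram mpg, normal by(foreign)
```

The normal and by() options correspond to the Density plots and By tabs in the dialog.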
Graphics > Pie chart. Command: graph pie, over(marital)
Graphics > Box plot. Command: graph box variable
Resources: libguides.mit.edu/stat. Help commands in Stata: the help command (help summarize); type db to bring up a dialog window showing the available options for a command (db summarize); the search command (search regression). Menus: Help > Search and Help > PDF Documentation.
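Both chart commands can be tried as in this sketch, which assumes Stata's built-in auto dataset (foreign and mpg are variables in that dataset, not in the workshop data):

```stata
sysuse auto, clear
graph pie, over(foreign)         // one slice per category of foreign
graph box mpg, over(foreign)     // one box per category of foreign
```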