Laptop users: you will need a copy of Stata installed on your machine. Harvard FAS affiliates can install a licensed version from http://downloads.fas.harvard.edu/download
Lab computer users: log in using your Athena user name and password
Everyone:
To get help in Stata, type help followed by a topic or command, e.g., help codebook.
Most Stata commands follow the same basic syntax: Command varlist, options.
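As a rough illustration of this pattern, the sketch below runs one command with a variable list and an option. The three rows of toy data are made up so the example runs on its own; they are not taken from the GSS file used in the workshop.

```stata
* Toy data (made-up values, not the GSS file) so the example is self-contained
clear
input age educ
25 12
40 16
61 10
end

* Command    varlist   , options
  summarize  age educ  , detail

* The same command without options prints the shorter default table
summarize age educ
```

Everything before the comma says what to do and to which variables; everything after the comma adjusts how the command behaves.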
Start with a comment describing your do-file, and use comments throughout
* Use '*' to comment a line and '//' for in-line comments
* Make Stata say hello:
disp "Hello " "World!" // 'disp' is short for 'display'
Use /// to break varlists over multiple lines:
disp "Hello" ///
     " World!"
* change directory
// cd "C:/Users/dataclass/Desktop/StataIntro"
cd dataSets
// open the gss.dta data set
use gss.dta, clear
// save data file:
save newgss.dta, replace // "replace" option means OK to overwrite existing file
* import data from a .csv file
import delimited gss.csv, clear
* save data to a .csv file
export delimited gss_new.csv, replace
* import/export SAS xport files
clear
import sasxport gss.xpt
export sasxport gss_new, replace
Open a .do file and change directory (cd) to the dataSets folder.
use gss.dta, clear
sum educ // statistical summary of education
codebook region // information about how region is coded
tab sex // numbers of male and female participants
/* Histograms */
hist educ
// histogram with normal curve; see 'help hist' for other options
hist age, normal
/* scatterplots */
twoway (scatter educ age)
graph matrix educ age inc
* By Processing
bysort sex: tab happy // tabulate happy separately for men and women
bysort marital: sum educ // summarize education by marital status
happy for married individuals only
/* Labelling and renaming */
// Label variable inc "household income"
label var inc "household income"
// change the name 'educ' to 'education'
rename educ education
// you can search names and labels with 'lookfor'
lookfor household
/*define a value label for sex */
label define mySexLabel 1 "Male" 2 "Female"
/* assign our label set to the sex variable*/
label val sex mySexLabel
var | rename to | label with |
---|---|---|
v1 | marital | marital status |
v2 | age | age of respondent |
v3 | educ | education |
v4 | sex | respondent's sex |
v5 | inc | household income |
v6 | happy | general happiness |
v7 | region | region of interview |
value | label |
---|---|
1 | "married" |
2 | "widowed" |
3 | "divorced" |
4 | "separated" |
5 | "never married" |
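One possible way to apply the two tables above is sketched below. The data rows are made-up toy values, and the label-set name maritalLabel is a hypothetical name chosen for illustration; the variable names, variable labels, and value labels come from the tables.

```stata
* Toy data with generic names v1-v7 (values are made up for illustration)
clear
input v1 v2 v3 v4 v5 v6 v7
1 34 16 1 22.5 1 3
5 22 12 2  8.0 2 1
3 58 14 2 15.0 3 7
end

* rename each variable, then attach the label from the first table
rename v1 marital
label var marital "marital status"
rename v2 age
label var age "age of respondent"
rename v3 educ
label var educ "education"
rename v4 sex
label var sex "respondent's sex"
rename v5 inc
label var inc "household income"
rename v6 happy
label var happy "general happiness"
rename v7 region
label var region "region of interview"

* define the value labels from the second table
* ('maritalLabel' is a hypothetical label-set name)
label define maritalLabel 1 "married" 2 "widowed" 3 "divorced" ///
    4 "separated" 5 "never married"

* assign the label set to the marital variable
label val marital maritalLabel

* check the result
describe
tab marital
```

Note that the /// continuation works when the lines are run from a do-file, not when typed one at a time in the Command window.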
Operator | Meaning |
---|---|
== | equal to |
!= | not equal to |
> | greater than |
>= | greater than or equal to |
< | less than |
<= | less than or equal to |
& | and |
\| | or |
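The operators in this table are typically used after if to restrict a command to a subset of rows. The sketch below uses made-up toy data and assumes marital is coded as in the value table above (1 = married, 3 = divorced, 5 = never married).

```stata
* Toy data (made-up values) so the example runs on its own
clear
input age inc marital happy
25  8.0 1 1
35 12.5 1 2
45 20.0 3 3
28 15.0 5 1
end

* tabulate happy for married respondents only
tab happy if marital == 1

* '&' requires both conditions to hold
count if age < 30 & inc >= 10

* '|' requires at least one condition to hold
count if marital == 3 | marital == 5

* '!=' selects everyone who is not married
sum inc if marital != 1
```

Remember that a single = assigns a value, while the double == tests for equality.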
// create a new variable named mc_inc
// equal to inc minus the mean of inc
gen mc_inc = inc - 15.37
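Typing the mean by hand requires looking it up first and retyping it whenever the data change. A minimal sketch of an alternative, assuming you are happy to let Stata supply the value: summarize stores the mean in r(mean), which can be used directly in the expression. The toy values below are made up for illustration.

```stata
* Toy data (made-up values) so the example runs on its own
clear
input inc
10
15
20
end

* summarize stores its results, including the mean, in r()
quietly summarize inc

* subtract the stored mean instead of a hand-typed number
gen mc_inc = inc - r(mean)

* the centered variable should now have a mean of (approximately) zero
summarize mc_inc
```

Run gen immediately after summarize, because most later commands overwrite the values stored in r().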
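/* the 'generate and replace' strategy */
// generate a column of missings
gen age_wealth = .
// Next, start adding your qualifications
// ('>=' so that age == 30 and inc == 10 are not left missing;
// missing values count as larger than any number, so exclude them)
replace age_wealth=1 if age<30 & inc < 10
replace age_wealth=2 if age<30 & inc >= 10 & !missing(inc)
replace age_wealth=3 if age>=30 & !missing(age) & inc < 10
replace age_wealth=4 if age>=30 & !missing(age) & inc >= 10 & !missing(inc)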
// conditions can also be combined with "or"
gen young=0
replace young=1 if age_wealth==1 | age_wealth==2
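When building variables from conditions like these, it is worth checking how missing values are handled, because Stata treats a numeric missing value (.) as larger than any number. The sketch below uses made-up toy data, and high_inc is a hypothetical variable name used only for illustration.

```stata
* Toy data (made-up values); the third row has a missing income
clear
input age inc
25  8
25 20
45  .
end

* this count includes the row where inc is missing
count if inc > 10

* excluding missing values explicitly
count if inc > 10 & !missing(inc)

* generate and replace, leaving missing incomes as missing
* ('high_inc' is a hypothetical name)
gen high_inc = .
replace high_inc = 0 if inc <= 10
replace high_inc = 1 if inc > 10 & !missing(inc)

* the 'missing' option shows how many rows were left unclassified
tab high_inc, missing
```

Tabulating a new variable with the missing option after each generate-and-replace sequence is a quick way to confirm that every row ended up where you intended.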
Please take a moment to fill out a very short feedback form
These workshops exist for you – tell us what you need!
IQSS workshops: http://projects.iq.harvard.edu/rtc/filter_by/workshops
IQSS statistical consulting: http://dss.iq.harvard.edu
The RCE