STATA conversion class, September 2009, Professor Mark Franklin
(SEE BELOW FOR STATA TUTORIAL)
This class is intended for researchers who already know
how to use SPSS for quantitative research at any level and want to be able to
take classes in quantitative data analysis using STATA. It should be taken by
researchers who have enough experience of quantitative research not to require
the "absolute beginners" course. That course contains its own
introduction to STATA, so there should be no need for anyone to take both
courses.
Note that the conversion course will only be taught this one time during the academic year, so it should be taken by researchers who plan to take any quantitative methods classes (including 3rd term workshops) if they do not have experience of using STATA.
Register for the class with Alessandra.Torre@EUI.eu (ext
2211).
Please bring your laptop to the class with STATA
installed. If you do not have a copy of STATA already on your laptop you can
get it (along with a site licence) from the Badia Computing Service Site Office
(by email to BF-Site@eui.eu, stating whether you need the Windows or Mac version).
It may take 24 hours or more to get the license key from STATA Inc., so be sure
to install your copy well ahead of the class.
Datasets
Turnout dataset for Stata 9 or later
Class instructions
Before the class, read the two-page "Introduction
to STATA for SPSS users" and download one of the datasets whose links
you see above.
After the class:
Download 'Class instructions' (transcript of lecture)
Download visuals (what was on the screen during the lecture)
Download Introduction to Stata 8 (September 2004) by Svend Juul, Department of Epidemiology and Social Medicine, University of Aarhus. (This is now quite dated but still an excellent introduction to STATA usage.)
STATA TUTORIAL, and tips for data management
The following files provide a (somewhat out of date
but otherwise excellent) tutorial for those who could not attend the conversion
class or who want to refresh and extend what they learned in the class.
Tutorial 1 Tutorial 2 Tutorial 3 Tutorial 4 Tutorial 5 Tutorial 6 Tutorial 7
Many of those coming to STATA from SPSS also bring existing
datasets that need to be converted, and converting data from SPSS to
STATA format is a constantly recurring need. This is most easily done by
invoking "Save as…" from the SPSS File menu and choosing a STATA
format in the resulting dialogue box (the different STATA formats
listed there all come to the same thing). There is also a stand-alone
program called Stat/Transfer that converts pretty much any data format to any
other data format, preserving variable and value labels.
However, simply converting the data to STATA format is
often only the first chore. Often the SPSS data contain missing-value codes that are not
converted to STATA missing values (and may not even have been defined as
missing values in SPSS). STATA has its own representation of missing values
(shown as "." or ".a" to ".z" in the data
matrix), which have to be set explicitly with a "generate" or
"replace" command in STATA. This is much simpler if one can specify,
in a single command, all the variables for which a specific set of codes
needs to be defined as missing.
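As a minimal sketch of the one-variable-at-a-time approach described above, the following do-file fragment builds a tiny made-up dataset and recodes its missing-data codes with "replace". The variable names (age, income) and the codes 98 and 99 are assumptions chosen purely for illustration; they do not come from the class datasets.

```stata
* Hypothetical data: 98 = "don't know", 99 = "refused" (illustrative codes only)
clear
input age income
34 2500
99 98
51 99
end

* Recode each SPSS-style code to a STATA missing value, one variable at a time
replace age    = .  if age == 99
replace income = .a if income == 98
replace income = .b if income == 99

list
```

Using the extended codes ".a" and ".b" keeps the two reasons for missingness distinguishable, while both are still treated as missing by STATA.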
In STATA, this can be done using a quite general
facility for looping over a series of variables and/or values, which is useful for much
more than dealing with missing data codes. Two methods are provided, one of
them documented in the current STATA help files and one of them not (because it
is an obsolete facility which, however, is still supported). The documented
features are described in the STATA help files under "foreach",
"forvalues" and "while". These are quite hard to use, but
worth mastering if you have ambitions to actually program STATA do and ado
files. The undocumented features are far easier to use and are described in the
linked document on the STATA old-style
"for" command, which also contains examples of how these
facilities can be used to deal with SPSS-style missing data codes. The ideas
presented in these examples might be useful even to those who want to use the
current STATA facilities to undertake missing data conversion.
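The documented looping facility can be sketched as follows. This is only an illustration under stated assumptions: the variables q1 to q3 and the codes 98 and 99 are invented for the example and do not refer to any of the class datasets.

```stata
* Hypothetical data in which 98 and 99 are SPSS-style missing-data codes
clear
input q1 q2 q3
1 99 3
98 2 99
4 5 6
end

* Loop over the variables, setting codes 98 and 99 to system missing
foreach v of varlist q1-q3 {
    replace `v' = . if inlist(`v', 98, 99)
}

list
```

Inside the loop, `v' is replaced in turn by each variable name in the list. For this particular task STATA also has a built-in command, so the loop could be replaced by the single line "mvdecode q1-q3, mv(98 99)"; the loop remains the more general tool.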
SPECIAL NOTE regarding MISSING VALUES in STATA
While missing values in STATA are handled automatically by statistical and
other commands, the "if" qualifier to STATA
commands treats missing values as very large numbers. Thus "regress a b if c
> 5" includes in the
regression analysis observations where c is greater than 5 and also any observations where c is missing. To specify non-missing values
greater than 5 one would have to say "if c > 5 & c < .". This curious feature of the STATA command
language causes much anguish!
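The behaviour described in this note can be checked with a small made-up example; the three values of c below are invented for illustration.

```stata
* Hypothetical data: one small value, one large value, one missing value
clear
input c
3
7
.
end

count if c > 5                  // counts 2: the value 7 and the missing value
count if c > 5 & c < .          // counts 1: only the value 7
count if c > 5 & !missing(c)    // counts 1: equivalent, using missing()
```

The "c < ." form also excludes the extended missing values ".a" to ".z", because all of them sort above the system missing value ".".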