STATA conversion class

September 2009

 

Professor Mark Franklin

 

 

 

 

(SEE BELOW FOR STATA TUTORIAL)

 

This class is intended for researchers who already know how to use SPSS for quantitative research at any level and want to be able to take classes in quantitative data analysis using STATA. It should be taken by researchers who have enough experience of quantitative research not to require the "absolute beginners" course. That course contains its own introduction to STATA, so there should be no need for anyone to take both courses.

 

Note that the conversion course will only be taught this one time during the academic year, so it should be taken by researchers who plan to take any quantitative methods classes (including 3rd term workshops) if they do not have experience of using STATA.

 

Register for the class with Alessandra.Torre@EUI.eu (ext 2211).

 

Please bring your laptop to the class with STATA installed. If you do not have a copy of STATA already on your laptop you can get it (along with a site licence) from the Badia Computing Service Site Office (by email to BF-Site@eui.eu, stating whether you need it for the Windows or Mac platform). It may take 24 hours or more to get the licence key from STATA Inc., so be sure to install your copy well ahead of the class.

 

 

Datasets

 

Turnout dataset for Stata 9 or later

 

 

 

 

Class instructions

 

Before the class, read the two-page "Introduction to STATA for SPSS users" and download one of the datasets whose links you see above.

 

 

After the class:

 

Download 'Class instructions' (transcript of lecture)

 

Download visuals (what was on the screen during the lecture)

 

Download Introduction to Stata 8 (September 2004) by Svend Juul,

Department of Epidemiology and Social Medicine, University of Aarhus,

(This is now quite dated but still an excellent introduction to STATA usage)

 

 

STATA TUTORIAL, and tips for data management

 

The following files provide a (somewhat out of date but otherwise excellent) tutorial for those who could not attend the conversion class or who want to refresh and extend what they learned in the class.

 

Tutorial 1    Tutorial 2    Tutorial 3    Tutorial 4    Tutorial 5    Tutorial 6    Tutorial 7

 

Many coming to STATA from SPSS also come with existing datasets that need to be converted, and converting data from SPSS to STATA format is a constantly recurring need. This is most easily done by invoking "Save as…" from the SPSS File menu and choosing a STATA format in the resulting dialogue box (the various STATA formats listed there all come to the same thing). There is also a stand-alone program called 'Stat/Transfer' that converts pretty much any data format to any other data format, preserving variable and value labels.

 

However, simply converting the data to STATA format is often only the first chore. Often the SPSS data have missing-value codes that are not converted to STATA missing values (and that may not even have been defined as missing values in SPSS). STATA has its own definition of missing values (designated as "." or ".a" to ".z" in the data matrix), which have to be explicitly set by a "generate" or "replace" command in STATA. This is much simpler if one can specify, in a single command, all the variables for which a specific set of codes needs to be defined as missing.
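As a minimal sketch of this step, the commands below set SPSS-style codes to STATA missing values for several variables at once. The variable names (age, income) and the codes 98 and 99 are invented for illustration. The sketch uses STATA's built-in "mvdecode" command, which is an alternative to the "replace" route described above; the "replace" equivalent for a single variable is shown as a comment.

```stata
* Hypothetical example data: 98 and 99 are assumed SPSS-style missing codes
clear
input age income
25 1200
99 98
40 99
98 3000
end

* mvdecode converts the listed codes to missing in every named variable,
* here mapping 98 to .a and 99 to .b so the two reasons stay distinguishable
mvdecode age income, mv(98=.a \ 99=.b)

* Equivalent for one variable using replace (loses the 98/99 distinction):
* replace age = . if age == 98 | age == 99

list
```

Mapping each code to its own extended missing value (".a", ".b") keeps the original reason for missingness available if it is needed later.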

 

In STATA, this can be done using a quite general facility for looping over a series of variables and/or values, which is useful for much more than dealing with missing data codes. Two methods are provided: one is documented in the current STATA help files; the other is not, because it is an obsolete facility (though it is still supported). The documented features are described in the STATA help files under "foreach", "forvalues", and "while". These are quite hard to use, but worth mastering if you have ambitions to actually program STATA do and ado files. The undocumented features are far easier to use and are described in a document available at STATA old-style "for" command, which also contains examples of how these facilities can be used to deal with SPSS-style missing data codes. The ideas presented in these examples might be useful even to those who want to use the current STATA facilities to undertake missing data conversion.
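A minimal sketch of the documented looping commands follows. The variables v1 to v3 and the missing-data code 9 are assumptions made up for this example; only "foreach" and "forvalues" themselves come from the STATA help files.

```stata
* Hypothetical example data: 9 is an assumed SPSS-style missing code
clear
input v1 v2 v3
1 9 3
9 2 9
4 5 6
end

* foreach runs the same command once per variable in the list;
* `var' is a local macro holding the current variable name
foreach var of varlist v1 v2 v3 {
    replace `var' = . if `var' == 9
}

* forvalues loops over a numeric range; here it creates v1sq, v2sq, v3sq
forvalues i = 1/3 {
    generate v`i'sq = v`i'^2
}

list
```

Note the quoting of the loop macro: it opens with a left single quote (backtick) and closes with an ordinary apostrophe, which is a common source of errors for newcomers.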

 

SPECIAL NOTE regarding MISSING VALUES in STATA

While missing values in STATA are automatically taken into account by statistical and other commands, the "if" qualifier to STATA commands treats missing values as very large numbers. As a result, "regress a b if c > 5" includes in the regression analysis cases where c is greater than 5 and also any cases where c is missing. To specify non-missing values greater than 5 one would have to say "if c > 5 & c < ." (with the final "." being STATA's missing value code). This curious feature of the STATA command language causes much anguish!
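The behaviour can be checked on a tiny made-up dataset. The variable name c follows the example above; the four data values are invented for illustration. The "missing()" function in the last line is an equivalent way of writing the same restriction.

```stata
* Hypothetical data: one value below 5, two above, and one missing
clear
input c
3
7
.
10
end

* Missing counts as larger than any number, so this matches 7, 10 and "."
count if c > 5

* Excluding missing values explicitly leaves only 7 and 10
count if c > 5 & c < .

* Same restriction written with the missing() function
count if c > 5 & !missing(c)
```

The first count reports 3 observations and the other two report 2, which is exactly the difference that goes unnoticed when the qualifier is applied inside "regress" or "generate".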