I’m working in SAS again, after about eight gleeful years. To be sure, I didn’t miss it, but I’d forgotten all the reasons why.
My data is in SAS, with a format catalog formats.sasb7cat. I want a fully-labeled Stata version of that data. What to do?
The following takes a SAS file and format catalog that reside in the same directory, spits out the format catalog to Stata (you’ll need to export your data in a separate step), turns the data file into a text file containing all the -label define- and -label values- statements that will reproduce the value labels present in the SAS file.
In SAS:
libname mine "c:\temp\sasxport"; libname library "c:\temp\sasxport"; proc format lib=library cntlout=fmtdata; run; proc export data=fmtdata FILE="lbls_converted.dta" DBMS=STATA replace; run;
Now, escaping back to Stata. The following is a supremely lazy way of approaching the problem, but it gets the job done. For one thing, I use -preserve- and -restore- to drop all but the line in which I’m working.
use lbls_converted.dta, clear // my output from SAS local labeldo "convert_dict.do" // this is the do-file that'll contain your generated -label- statements tempname outfile file open `outfile' using `labeldo', write replace // if you've got anything "weird" in your data you want to drop, do it before this line... qui count local nobs=`r(N)' by fmtname, sort: gen rowindex=_n foreach v of varlist _all { // I had a bunch of leading / trailing whitespace -- YMMV cap noi replace `v' = trim(`v') } // this may be a peculiarity of my situation -- some, but not all, formats end in _F // -regexr- would be smarter here, as it could trim only the *final* instance gen vname = subinstr(fmtname,"_F","",1) // replace one instance of _F forvalues row = 1/`nobs' { preserve qui keep in `row' local vname = vname local valname = start local vallab = subinstr(label, "''","",.) if rowindex==1 { *if "`labelline'"!="" file write `outfile' "`labelline'" _n if "`varline'"!="" file write `outfile' _n "`varline'" _n local varline "label values `vname' `vname'" file write `outfile' "label define `vname' " } * local labelline "`labelline' `valname' `vallab'" file write `outfile' `" `valname' "`vallab'""' restore } // end foreach file close `outfile' use "data_converted.dta", clear do `labeldo'
How might this fall apart? Easily. For example, for SAS FORMATs might not be named the same as your variables. I’m sure there’s a step that could solve this easily enough, but I’m fortunate to not have had to deal with it.