Martin Saavedra
Assistant Professor
Department of Economics
Oberlin College

Lasso Industry Demographic and Occupation (LIDO) Score

The LIDO score is intended to improve on the traditional occupational income score from IPUMS. The occupational income score predicts 1950 earnings using only occupation, while the LIDO score predicts earnings using occupation, industry, and demographics. The details are in the working paper "A Machine Learning Approach to Improving Occupational Income Scores." There are both theoretical and empirical reasons to believe that estimates based on the LIDO score should be closer to those from an earnings regression than estimates based on the occupational income score. To use the LIDO score, you will need the following variables from IPUMS: region, statefip, sex, age, race, occ1950, and ind1950. It should then take only one line of Stata code to merge in the score. The LIDO score is measured in hundreds of 1950 dollars. The data are preliminary.
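
As a rough sketch of the merge step, the Stata commands below assume that the downloaded LIDO file has one row per combination of the seven IPUMS variables listed above. The file names ipums_extract.dta and lido_score.dta are placeholders, not the actual names, and the key structure of the real file may differ, so check it against the download before relying on this.

```stata
* Placeholder file names: substitute your own IPUMS extract and the downloaded LIDO file.
use "ipums_extract.dta", clear

* Many-to-one merge on the seven IPUMS variables.
* Assumes the LIDO file is uniquely identified by these variables.
merge m:1 region statefip sex age race occ1950 ind1950 using "lido_score.dta"

* Inspect how many observations matched before using the score.
tabulate _merge
```

Observations with _merge equal to 1 are in the IPUMS extract but received no score, so it is worth checking which cells they fall in.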

Download the LIDO score.