Public Data Sets Available to Students

IPUMS

IPUMS is an ongoing research project that collects micro-level census data from around the world. Many governments provide their census microdata to this project, and you can download the census microdata for each participating country. The list of participating countries is available here.

Advantage: Very large sample size, which makes estimates more precise and makes it easier to detect statistically significant effects. The census is usually repeated every ten years.

Disadvantage: Information is mostly restricted to the labor market, housing, and fertility. Information on income, earnings, and wage rates is not available. Consumption data are not available either.

 

Demographic and Health Surveys (DHS)

DHS is another ongoing project that collects microdata on health, family, and other demographic information. Many countries participate in this project. The list of participating countries is here.

The advantage of this data set is the availability of detailed information on health outcomes and health behavior. Because of concern about HIV, it also collects information on sexual behavior. In some countries, it also collects geographical information (latitude and longitude). I have found that this data set can be very powerful when combined with other data sets. The sample size is not as big as the census, of course, but it is large enough to run IV regressions. With 2,000 observations, running an IV regression is quite difficult; you need more. DHS usually has more than 10,000.
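To illustrate why sample size matters for IV regression, here is a minimal simulation sketch. It is not based on DHS data: the data-generating process, the parameter values (a true effect of 1.0 and a first-stage coefficient of 0.2), and the function names `iv_estimate` and `spread` are all assumptions made up for this illustration. It draws many artificial samples with one endogenous regressor and one instrument, computes the simple IV estimate cov(z, y) / cov(z, x) on each, and reports how widely the estimates scatter for a given sample size.

```python
import random
import statistics


def iv_estimate(n, rng, beta=1.0, pi=0.2):
    """One-instrument IV estimate, cov(z, y) / cov(z, x), on simulated data.

    Assumed (hypothetical) model:
        x = pi * z + v,   y = beta * x + u,   v = 0.5 * u + noise
    so x is correlated with the error u (endogenous) and z is a valid instrument.
    """
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)           # instrument
        u = rng.gauss(0, 1)            # structural error
        v = 0.5 * u + rng.gauss(0, 1)  # first-stage error, correlated with u
        xi = pi * zi + v
        z.append(zi)
        x.append(xi)
        y.append(beta * xi + u)
    zbar, xbar, ybar = statistics.fmean(z), statistics.fmean(x), statistics.fmean(y)
    num = sum((a - zbar) * (b - ybar) for a, b in zip(z, y))
    den = sum((a - zbar) * (b - xbar) for a, b in zip(z, x))
    return num / den


def spread(n, reps=40, seed=0):
    """Standard deviation of the IV estimate across `reps` simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [iv_estimate(n, rng) for _ in range(reps)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (2000, 10000):
        print(n, round(spread(n), 3))
```

In this setup the estimates scatter more widely at 2,000 observations than at 10,000, and the gap grows as the instrument gets weaker (a smaller `pi`). The exact numbers depend entirely on the assumed parameters.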

 

Living Standards Measurement Study (LSMS)

This ongoing research project collects micro-level data on consumption, saving, and living standards. Consumption data in LDCs are very rare, so this project provides an important research opportunity. But the sample size is relatively small, which often makes statistical inference very difficult in the presence of endogeneity.

Consumption and Expenditure Data in the US

If you are interested in the economics of consumption and want to test the theory with data, this is the site you should look at. This data set collects detailed information on consumption behavior. Many cutting-edge papers have been written using it.

US Patent Data

This data set is collected from the US patent office. When a multinational company develops an innovative product, it usually registers a patent not only in its home country but also in the US, because the US is the largest market. This implies that the US patent office holds a huge amount of patent information. This data set collects all the information on patents registered at the US patent office. It can be very interesting for anyone who is interested in innovation and firm activity.

 

Panel Study of Income Dynamics (PSID)

This data set follows the same households, their children, and their grandchildren for more than 40 years. This makes it possible for researchers to examine the intergenerational transmission of wealth, inequality, jobs, occupations, and social status. Many papers have been written using this data set. If you are interested in the long-run wealth accumulation of households, this is the data set you should look at.
