Working with SOEP Data

The following exercises are taken from our SOEP Campus Workshops, a service especially for young scholars in the disciplines of sociology, economics, and psychology. Here we provide introductions to the use of the SOEP data.

As a new user, the best way to get to know the SOEP data is to start with the tracking data.

Tracking data are the basis for linking your research-relevant variables. In addition to various demographic information, they also provide information on how the interview was conducted. Treat these datasets as the starting point to which you merge your research-relevant variables via the individual and household numbers.
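The idea of starting from the tracking data and attaching further variables can be sketched as follows. This is a minimal Python illustration using only the standard library, not official SOEP code: all records are invented, and the column names `pid` (person number), `hid` (household number), and `syear` (survey year), as well as the variables `life_sat` and `hh_size`, are assumptions made for this example.

```python
# Hypothetical tracking records: one row per person and survey year.
tracking = [
    {"pid": 101, "hid": 1, "syear": 2020},
    {"pid": 102, "hid": 1, "syear": 2020},
    {"pid": 201, "hid": 2, "syear": 2020},
]

# Hypothetical person-level variable of interest (not every person answered).
person_data = [
    {"pid": 101, "syear": 2020, "life_sat": 7},
    {"pid": 201, "syear": 2020, "life_sat": 9},
]

# Hypothetical household-level variable.
household_data = [
    {"hid": 1, "syear": 2020, "hh_size": 2},
    {"hid": 2, "syear": 2020, "hh_size": 1},
]


def left_merge(base, other, keys):
    """Attach the columns of `other` to each row of `base` matching on `keys`.

    Rows of `base` without a match are kept unchanged (a left join).
    Assumes `keys` identify each row of `other` uniquely.
    """
    lookup = {tuple(row[k] for k in keys): row for row in other}
    merged = []
    for row in base:
        match = lookup.get(tuple(row[k] for k in keys), {})
        merged.append({**row, **match})
    return merged


# Start from the tracking data, then add person and household variables.
result = left_merge(tracking, person_data, keys=("pid", "syear"))
result = left_merge(result, household_data, keys=("hid", "syear"))

for row in result:
    print(row)
```

Because the tracking data serve as the base, person 102 stays in the result even though no `life_sat` value exists for that person; the household variable is attached to every member of the household through `hid`.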

Since 2021, information about the survey itself has been stored in the instrumentation dataset. This dataset provides information on the interview, e.g. the date or the mode of the interview.

Users should then merge the variables they want to analyze onto the tracking data. The following exercises help with putting together cross-sectional datasets:

The SOEP team recommends using the long datasets. How to merge longitudinal datasets, and what these datasets offer users, is shown in these exercises:

To get an idea of the analysis potential offered by the SOEP, we recommend the following exercises to our users:

To learn how to work with the various regional data in the SOEP, we recommend the following exercises:

If you want to import the SOEP data as csv files with an older version of Stata, this exercise will help you.

If you want to import the SOEP data as opendf files in R, this page will help you.

Combining datasets is not always as easy as it seems. Here are some examples of merging SOEP datasets in Stata.
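One common reason a merge goes wrong is that the merge key does not identify rows uniquely. Independently of the Stata examples, such a check could be sketched in Python as below. The records are invented, and the column names `pid`, `syear`, and `income` are assumptions for illustration only.

```python
from collections import Counter


def duplicate_keys(rows, keys):
    """Return the key combinations that occur more than once in `rows`."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical long-format records; person 101 appears twice in 2020.
records = [
    {"pid": 101, "syear": 2019, "income": 2100},
    {"pid": 101, "syear": 2020, "income": 2300},
    {"pid": 101, "syear": 2020, "income": 2350},
    {"pid": 102, "syear": 2020, "income": 1800},
]

# In long-format data a person appears once per survey year, so `pid`
# alone repeats by design; the pair (`pid`, `syear`) should not.
print(duplicate_keys(records, keys=("pid",)))
print(duplicate_keys(records, keys=("pid", "syear")))
```

An empty list for the intended key means a one-to-one merge on that key is safe; any reported combination should be inspected before merging.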