Working with SOEPhelp


For data users, the SOEP provides a help tool for two analysis programs: a Stata .ado and an R package. Once the tool is installed in the respective program, it provides helpful information on the desired data sets or variables. The installation and use of SOEPhelp are explained below.

Working with SOEPhelp in R

SOEPhelp is an R package providing help pages for SOEP-Core data sets (top-level folder only) and their variables. Starting with version v35, SOEPhelp is available for R users, too. Using the metadata and documentation available for SOEP-Core data, the package displays information on data sets and variables in the default R help. Wherever available, the package provides labels in English and German. The basic information used to create the help pages is taken from the metadata available in the Public Core Documentation on git. Help pages are not available for data sets in the raw and EU-SILC Clone folders.


If you are working on a Windows OS, you first need to install Rtools (available from CRAN). Depending on your CPU, installing the package can take anywhere from about 3 minutes to over an hour, because it contains more than 14000 help pages. The most recent version (SOEPhelp_0.38.0.tar.gz) was built using R 4.3.1.

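# Adjust the file name to the location of the downloaded source archive.
install.packages("SOEPhelp_0.38.0.tar.gz",
                 repos = NULL, type = 'source', quiet = TRUE)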


Load the package with library() and read its main help page carefully.
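As a minimal sketch of this step, the snippet below attaches the package and prints its help index. It assumes the package was installed under the name SOEPhelp; help(package = ...) is the generic base R way to list a package's help pages and may not be the exact main page the text refers to.

```r
# Check whether SOEPhelp is installed before attaching it.
soephelp_installed <- requireNamespace("SOEPhelp", quietly = TRUE)

if (soephelp_installed) {
  library(SOEPhelp)                  # attach the package
  print(help(package = "SOEPhelp"))  # index of all help pages in the package
} else {
  message("SOEPhelp is not installed; install the source archive first.")
}
```

The guard keeps the snippet from failing with an error when the package is missing, which is convenient in scripts shared between machines.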


You can get to the help pages by using familiar R help functions like ? or help() as well as ??.

Example for a dataset


Asking for the help page of the design data set in R will open the following help page. The title tells you that the help page belongs to a data set, gives its name, and briefly states what the data set contains. This is followed by the description section, which provides (if available) a further description of the data set and a link to the page. In the arguments section you will find the list of variables in the data set. The variable names are linked to their help pages, and the variable labels are given in English and German (if available), together with the link to the page. The details section gives the number of observations and variables. Finally, the notes section refers to the version of the SOEP-Core.
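A request for this page might look like the following sketch. It assumes the help topic carries the same name as the data set ("design"); restricting the lookup to SOEPhelp avoids picking up same-named topics from other packages.

```r
# Assumption: the help topic for the data set is called "design".
design_help <- NULL
if (requireNamespace("SOEPhelp", quietly = TRUE)) {
  # Similar to typing ?design after library(SOEPhelp).
  design_help <- help("design", package = "SOEPhelp")
  print(design_help)  # opens the help page
}
```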


Example for a variable


Asking for the help page of the sampreg variable in R will open the following help page. The title provides you with the basic information that the help page belongs to a variable and its name. This is followed by the description section providing (if available) the variable label in English and German as well as a link to the page. In the arguments section you will find the corresponding values of the variable and their value labels. The following notes section refers to the version of the SOEP-Core. Finally, the see also section lists data sets which contain this variable.
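The variable lookup can be sketched in the same way. The topic name "sampreg" is assumed to match the variable name; help.search() is the function behind ??, and here it is limited to SOEPhelp so the result lists only pages from this package.

```r
# Assumption: the help topic for the variable is called "sampreg".
sampreg_hits <- NULL
if (requireNamespace("SOEPhelp", quietly = TRUE)) {
  # Open the variable's help page (similar to ?sampreg).
  print(help("sampreg", package = "SOEPhelp"))

  # Full-text search, similar to ??sampreg but restricted to SOEPhelp.
  sampreg_hits <- help.search("sampreg", package = "SOEPhelp")
  print(sampreg_hits)
}
```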


Working with SOEPhelp in Stata


The following tool is available starting with Version v34 (Wave bh) and Stata Version 12.

The SOEP data contain a wide array of useful additional information. SOEPhelp is a Stata .ado that displays documentation on the dataset at hand. It shows information such as variable histories directly in your Stata window.


Open Stata and enter the following command:

net install soephelp, replace from(

The .ado provides the following commands:

For a general introduction to SOEPhelp, type in the command help soephelp. Here you will find a detailed explanation of the Stata.ado and the different ways to use it. The .ado is available in German and English.


When you run the command soephelp on a wave-specific dataset (subdirectory raw), you receive a basic description of the dataset as well as a list of the samples it contains, including the instruments corresponding to each sample. Clicking the provided links takes you to the respective questionnaires or to the dataset on paneldata.


When you run soephelp on a longitudinal dataset, you likewise receive a basic description as well as a list of the wave-specific datasets used to generate the longitudinal version.


If you enter the command soephelp <variable> in a wave-specific data set, you will get detailed information about the variable in question. The question asked in the questionnaire is displayed as well as the samples and instruments in which the question was asked. Additionally, the command offers the corresponding long variable as well as the link of the displayed variable to the documentation at


Conversely, with long data, you receive the wave-specific input variables and datasets used to generate the long variable.


Since wave v35, a new option for the Stata command is available. With soephelp, search (string) you receive a list of variables matching the word or label you are looking for.

For example, suppose you are interested in variables regarding children in a household. With soephelp, search (child) you see all variables that have the word child in their label.


To receive more details on the list of variables, use soephelp, search (child) verbose. You can then click on a variable to open a new window with details on it, such as the question asked in the latest questionnaire, the question's source, and the long or core variable, depending on the data format.


To use this option in English, add en at the end of the option. For example: soephelp, search (child) verbose en.

SOEPhelp is directly linked to the SOEPcompanion.


Contributions of all sorts are very welcome. Issues and requests can be reported to:

Last change: Feb 21, 2024