Data Editions of SOEP-Core
Access to SOEP data is provided in compliance with the highest security standards to protect respondents’ confidentiality and maintain their trust in the survey. The data are provided solely for scientific research purposes, that is, they are only made available to members of the scientific community, and researchers are only given access after they have signed a data distribution contract with DIW Berlin. Different data packages, called “editions”, reflect these requirements. They differ in the amount of information they contain, the level of data protection, and the mode of data access. The EU Edition is considered the standard edition. More restricted editions provide less information; less restricted editions provide more information but are only available under stricter access conditions. The Teaching, International, and EU Editions are made available as downloads under the standard data distribution contract, while the two add-ons, Area Types and Planning Regions, require additional contracts. The Remote Edition can only be accessed through remote execution, and the Onsite Edition can only be accessed on site at the SOEP Research Data Center at DIW Berlin.
In this figure, “more restrictive” means that existing variables from the EU Edition are left blank for reasons of data protection or that not all cases are included. For example, variables that provide information at the federal state level are not available in the International or Teaching Editions (which only distinguish between East and West Germany). Conversely, a higher level of data protection makes it possible to provide more information, which makes an edition less restrictive in terms of the information available. In most cases, as more sensitive information is added to an edition, the mode of access and the requirements for its use change as well.
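As a reading aid, the editions named on this page can be written down as a small lookup structure. This is only a sketch: the class and field names (`Edition`, `access`, `requirement`) are illustrative assumptions, and the values merely paraphrase the text of this page rather than any official SOEP metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    name: str
    access: str        # how researchers get to the data
    requirement: str   # contract or application named on this page


# Values paraphrase this page; the structure itself is hypothetical.
EDITIONS = (
    Edition("Teaching Edition", "download", "standard data distribution contract"),
    Edition("International Edition", "download", "standard data distribution contract"),
    Edition("EU Edition", "download", "standard data distribution contract"),
    Edition("Add-on: Area Types", "add-on files", "additional regional data contract"),
    Edition("Add-on: Planning Regions", "add-on files", "additional regional data contract"),
    Edition("Remote Edition", "remote execution", "SOEPremote application"),
    Edition("Onsite Edition", "on site at the RDC SOEP", "data protection agreement"),
)


def editions_with_access(access):
    """Return the names of all editions that use the given access mode."""
    return [e.name for e in EDITIONS if e.access == access]


print(editions_with_access("download"))
```

Keeping the access mode and the contractual requirement in separate fields mirrors the distinction the text draws between how much information an edition contains and how it may be accessed.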
Teaching, International, and EU Edition
Only the standard data distribution contract is required for the EU Edition and the International Edition. The EU Edition includes 100% of all observations, the German federal states, and the urban/rural variable. This edition is only available to users from research institutions in the EU and in countries with an “adequacy decision” (Angemessenheitsbeschluss), including Switzerland, Japan, Canada, Israel, and a few others.
The International Edition is available to users from research institutions in all other countries. This edition contains 95% of all households from the first wave of each SOEP subsample, drawn by random sampling of the original households in each subsample, and only the East/West versions of variables that normally contain the federal states. The original variables (with information on the federal states) remain in the edition, but because they cannot be made accessible here, their values are assigned the missing code -7 “Only available in less restricted edition”. For more information on the missing codes in SOEP-Core, see the chapter Missing Conventions.
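In practice, a variable that is blanked out in an edition should not be analyzed as if -7 were a real category. The following is a minimal sketch of how this could be handled, assuming the values of a variable are available as a plain Python list. The function names and the two example columns are hypothetical and do not refer to actual SOEP variable names; only the code -7 is taken from the text above.

```python
# Missing code named in the text: "Only available in less restricted edition".
RESTRICTED_EDITION_CODE = -7


def blank_restricted(values):
    """Replace the -7 code with None so it is not treated as a real category."""
    return [None if v == RESTRICTED_EDITION_CODE else v for v in values]


def is_withheld(values):
    """True if every value of a (non-empty) variable carries the -7 code."""
    return len(values) > 0 and all(v == RESTRICTED_EDITION_CODE for v in values)


# Made-up example columns, as they might look in a more restricted edition.
federal_state = [-7, -7, -7, -7]   # original variable, blanked out
east_west = [1, 2, 1, 1]           # coarser East/West version, still usable

for name, column in [("federal_state", federal_state), ("east_west", east_west)]:
    status = "withheld in this edition" if is_withheld(column) else "available"
    print(name, status)
```

Other negative codes exist in SOEP-Core and are described in the chapter Missing Conventions; this sketch deliberately handles only the one code discussed here.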
The edition that is easiest to access, but also the one containing the least information, is the Teaching Edition. Here, a data distribution contract is required for teaching staff; students only need to sign the data protection declaration, which the contract holder must keep on file. The contract holder is responsible for ensuring strict adherence to data protection. German data protection laws stipulate that a maximum of 50% of all cases in the original dataset may be used for teaching purposes. The Teaching Edition has the same data structure as the International Edition (with the exception of the EU-SILC Clone) but contains half as many cases as the EU Edition. The Teaching Edition provided to students must be stored in a separate hard drive area to which the user guarantees controlled access. Students may under no circumstances take data home with them or transfer the data to any other device at the university.
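The International Edition is described above as a random sample of the original first-wave households within each subsample. The sketch below illustrates what sampling at the household level, separately per subsample, can look like. It is not the procedure used by the SOEP: the function names, the input layout, the rounding rule, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_households(first_wave_households, fraction, seed=0):
    """Draw a share of households separately within each subsample.

    first_wave_households: iterable of (household_id, subsample) pairs.
    Returns the set of household IDs that are kept.
    """
    by_subsample = defaultdict(list)
    for household_id, subsample in first_wave_households:
        by_subsample[subsample].append(household_id)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    kept = set()
    for subsample in sorted(by_subsample):
        household_ids = sorted(by_subsample[subsample])
        n_keep = round(len(household_ids) * fraction)
        kept.update(rng.sample(household_ids, n_keep))
    return kept


def restrict_persons(persons, kept_households):
    """Keep every person whose household was drawn, so households stay complete."""
    return [p for p in persons if p["hid"] in kept_households]


# Made-up example: 20 households in subsample "A", 40 in subsample "B".
households = [(i, "A") for i in range(20)] + [(100 + i, "B") for i in range(40)]
kept = sample_households(households, fraction=0.95)
print(len(kept), "of", len(households), "households kept")
```

Sampling households rather than persons keeps all members of a selected household together, which matters for any analysis that links persons within a household.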
Add-ons: Area Types and Planning Regions
In addition to the EU Edition, the SOEP offers datasets that extend the standard file to include municipality size classes (add-on: Area Types) or even spatial planning units (add-on: Planning Regions). Access to these files is more restricted because they provide users with more sensitive information about the respondents.
For the add-on Area Types, a regional data contract is required in addition to the data distribution contract. This requires that the user submit a data protection concept to the SOEP. There is no template for this; users must develop this concept specifically for the workplace in which they want to use the data.
For the add-on Planning Regions, a regional data contract is also necessary, and the SOEP requires that users submit a data protection concept that they have developed themselves. For this add-on, however, the requirements for the data protection concept are significantly higher.
Remote Edition
Further information, such as official county codes (KKZ) identifying administrative districts (Landkreise) and urban districts (kreisfreie Städte), can be accessed through remote execution using the Remote Edition (or on site). For this edition, users are required to submit an application to use SOEPremote in addition to the data distribution contract. For the remote execution contract, no separate data protection concept is required, as users only access the information remotely and no files are transmitted to computers outside of the Research Data Center of the SOEP (RDC SOEP).
To access the Remote Edition, there are two options available:
SOEPremote execution (e-mail processing)
SOEPremote access (on-site processing at special workstations)
With SOEPremote execution, users can email their Stata syntax to a remote server, which processes the syntax and returns the results to users by email. With SOEPremote access, users can use IGEL clients at RDC SOEP in Berlin. By using the IGEL clients, onsite users have the advantage of working directly with the Remote Edition instead of having to go through the email procedure. The disadvantage is having to plan and book a visit to the RDC SOEP in Berlin.
Onsite Edition
The Onsite Edition contains all available information. Guests using the IGEL clients at the RDC SOEP (see How to Use SOEP IGEL) can access additional information about the municipalities or postal codes of SOEP households, as well as data from microm GmbH on households’ neighborhoods. Users can even analyze geocoded data. To access these data, researchers are first required to sign a data protection agreement, and a complete record is kept of all data access. For privacy reasons, the point coordinates of SOEP households are kept separate from the actual survey information throughout the entire analysis process. Researchers therefore never have simultaneous access to the SOEP survey data and the geo-coordinates of SOEP households. Results may only be published in completely anonymous form and are checked before they are transmitted from the secure server to the user.
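The separation of coordinates and survey information can be pictured as a two-step workflow. The sketch below is an illustration of the principle only, under the assumption that a derived indicator (here, distance to a point of interest) is computed from coordinates alone and is later attached to survey records by household ID. The actual technical setup at the RDC SOEP is not described on this page, and all names, identifiers, and values in the code are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def derive_indicator(coordinates, point_of_interest):
    """Step 1: sees only household IDs and coordinates, no survey answers."""
    poi_lat, poi_lon = point_of_interest
    return {
        hid: distance_km(lat, lon, poi_lat, poi_lon)
        for hid, (lat, lon) in coordinates.items()
    }


def attach_indicator(survey_rows, indicator, name):
    """Step 2: sees survey answers and the derived value, never the coordinates."""
    return [{**row, name: indicator.get(row["hid"])} for row in survey_rows]


# Made-up data for illustration.
coordinates = {1: (52.5, 13.4), 2: (53.5, 13.4)}
survey_rows = [{"hid": 1, "satisfaction": 7}, {"hid": 2, "satisfaction": 5}]

km_to_poi = derive_indicator(coordinates, point_of_interest=(52.5, 13.4))
analysis_rows = attach_indicator(survey_rows, km_to_poi, "km_to_poi")
print(analysis_rows)
```

In this layout, no single function receives both the coordinates and the survey answers, which is the property the text describes.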
To apply to use a guest work station, click here:
For more information about your workplace at the SOEP Research Data Center, see the section How to Use SOEP IGEL.
For more information about how to work with SOEP’s spatial data, see the section Working with spatial data in R.
Last change: May 12, 2022