Target Population and Samples
The target population covered in the SOEP is defined as the population of private households residing within the current boundaries of the Federal Republic of Germany (FRG). Because of changes in these boundaries (in 1990) and changes in the population due to migration, various adaptations have been made to the initial sampling structure to maintain the sample’s representativity. In addition, certain groups have been oversampled to increase the statistical power.
The different SOEP-Core subsamples constitute the centerpiece of the SOEP.
Within SOEP-Core, samples A-Q form the heart of the SOEP. They include the oldest samples, beginning with the founding sample A from 1984, and account for the highest number of participating households. Fieldwork traditionally starts at the beginning of February, and the questionnaires for these samples serve as a master for the other SOEP-Core subsamples.
The SOEP migration sample, with its samples M1 and M2, was established in 2013 and is designed to improve the representation of migrants living in Germany. Fieldwork started in April, using the questionnaires from samples A-Q, supplemented by translated questionnaires in five different languages.
In order to map recent migration and integration dynamics, SOEP refugee samples M3 to M5 were established beginning in 2016. In 2020, fieldwork began in August with a questionnaire that was tailored to issues relevant to recent refugees while also containing many questions from the SOEP samples A-Q.
Sample M6, a boost sample of refugees, targeted the same population as the older refugee sample M5: adult refugees who have applied for asylum in Germany since 1 January 2013 and are currently living in Germany. The same sample design and sampling frame were used.
Two boost samples, M7 and M8a, were added to the SOEP migration sample system. As with the older migration samples M1 and M2, the Integrated Employment Biographies Sample (IEBS) of the Federal Employment Agency (BA) served as the sampling frame for both boost samples. The goal of boost sample M7 was to capture migration dynamics and processes from 2016 to 2018 with a focus on EU migration. To ensure that statistically meaningful group comparisons can be made, sampling was restricted to the three most important countries of origin in that time period: Romania, Bulgaria, and Poland. M8a, on the other hand, was designed to help evaluate the skilled worker immigration law (Fachkräfteeinwanderungsgesetz), which came into effect on March 1, 2020. It targeted migrants from third countries who came to Germany between 2017 and 2018, sampling them as a control group for a treatment group that will be sampled at a later date.
In 1984, the survey started with a sample covering the entire population of what was then West Germany (FRG), in which the five biggest groups of foreigners (“guest workers”) were oversampled.
The SOEP was expanded to the territory of the German Democratic Republic in June 1990, only six months after the fall of the Berlin Wall. In 1994/95, a boost sample of migrants who came to Germany after 1984 was added to take the influx of ethnic Germans from former Soviet countries into account. In 2013, another sample of migrants was added, comprising individuals who immigrated to Germany after 1995 as well as second-generation immigrants. Since then, multiple migration and refugee samples have been added in cooperation with the IAB (Institut für Arbeitsmarkt- und Berufsforschung) or the BAMF (Bundesamt für Migration und Flüchtlinge).
From time to time, samples representative of the entire population in Germany have been added to counter the effects of panel attrition and to increase the overall sample size.
The different samples in the SOEP are identified by letters: sample “A” refers to the German sample drawn in 1984, “C” to the East Germans from 1990, and so on. Even though these samples are kept separate, the respondents have received identical questionnaires for the most part, and distinctions by sample are usually not necessary in an analysis.
However, one of the ideas of the SOEP is that users have full information available about survey methodology and survey design, which in this case means that you can identify the corresponding sample for each observation. In the following section, we present details on each of the samples, which, unless stated otherwise, are multi-stage random samples with regional clusters. The households are selected by random-walk routines.
For an extensive discussion on sampling (and weighting), see: Survey methods.
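Because each observation can be traced back to its sample, analyses can be tabulated by sample or restricted to selected samples. The following is a minimal sketch of that idea in Python. The records, the field names `person_id` and `sample`, and both helper functions are hypothetical and chosen for illustration only; they are not actual SOEP variable names or data.

```python
from collections import Counter

# Hypothetical person-level records. The field names and values are
# illustrative assumptions, not actual SOEP variables or data.
records = [
    {"person_id": 1, "sample": "A"},
    {"person_id": 2, "sample": "A"},
    {"person_id": 3, "sample": "C"},
    {"person_id": 4, "sample": "M1"},
    {"person_id": 5, "sample": "M5"},
]


def count_by_sample(rows):
    """Return the number of observations per sample identifier."""
    return Counter(row["sample"] for row in rows)


def restrict_to_samples(rows, samples):
    """Keep only observations whose sample identifier is in `samples`."""
    wanted = set(samples)
    return [row for row in rows if row["sample"] in wanted]


counts = count_by_sample(records)
migration_only = restrict_to_samples(records, ["M1", "M2"])
```

Here `counts` maps each sample identifier to its number of observations, and `migration_only` keeps only the records from the hypothetical M1 and M2 entries. With real data, the sample identifier would come from the corresponding variable in the dataset rather than from a hand-built list.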