Target Population and Samples

The target population covered in the SOEP is defined as the population of private households residing within the current boundaries of the Federal Republic of Germany (FRG). Because of changes in these boundaries (in 1990) and changes in the population due to migration, various adaptations have been made to the initial sampling structure to maintain the sample’s representativity. In addition, certain groups have been oversampled to increase the statistical power.

In 1984, the survey started with a sample covering the entire population of what was then West Germany (FRG), in which the five largest groups of foreigners (“guest workers”) were oversampled.

Institutionalized populations (in the strict sense: people living in hospitals, nursing homes, and military installations) are generally not representatively included in new samples. In 1984, for instance, only 57 institutionalized households were included. However, individuals from the initial survey households who later took up temporary or permanent residence in institutions were surveyed regularly.

The SOEP was expanded to the territory of the German Democratic Republic in June 1990, only six months after the fall of the Berlin Wall. In 1994/95, a boost sample of migrants who came to Germany after 1984 was added to take the influx of ethnic Germans from former Soviet countries into account. Two samples that were representative of the entire population in Germany were added in 1998 and 2000 to counter effects of panel attrition and to increase the overall sample size. In 2002, a high-income boost sample was added, and in 2006 and 2009, additional refresher samples were added.

To increase the overall sample size further, SOEP added a new series of refresher samples beginning in 2011. The first (2011) and second (2012) of these are representative of the entire population, whereas the third (2013) covers migrants. For the fourth such sample in 2014, the related study “Families in Germany” was integrated into the SOEP.

The different samples in the SOEP are identified by letters: sample “A” refers to the German sample drawn in 1984, “C” to the East Germans from 1990, and so on. Even though these samples are kept separate, the respondents have received identical questionnaires for the most part, and distinctions by sample are usually not necessary in an analysis.

However, one of the principles of the SOEP is that users have full information about survey methodology and survey design, which in this case means that you can identify the sample to which each observation belongs. In the following section, we present details on each of the samples, which, unless stated otherwise, are multi-stage random samples with regional clusters. The households are selected by random-walk routines.
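As a minimal sketch of what identifying the sample for each observation can look like in practice, the following Python snippet counts and filters toy records by sample letter. The records and the field names "pid" and "sample" are hypothetical and chosen for illustration only; they are not taken from the SOEP data documentation. Only the two sample letters named above are labeled.

```python
from collections import Counter

# Hypothetical toy records. The field names "pid" and "sample" are
# illustrative assumptions, not actual SOEP variable names.
observations = [
    {"pid": 1, "sample": "A"},
    {"pid": 2, "sample": "A"},
    {"pid": 3, "sample": "C"},
]

# Labels for the two sample letters described in the text.
SAMPLE_LABELS = {
    "A": "German sample drawn in 1984",
    "C": "East Germans from 1990",
}


def count_by_sample(records):
    """Count observations per sample letter."""
    return Counter(r["sample"] for r in records)


def select_samples(records, letters):
    """Keep only observations whose sample letter is in `letters`."""
    wanted = set(letters)
    return [r for r in records if r["sample"] in wanted]


counts = count_by_sample(observations)
east = select_samples(observations, ["C"])
```

Restricting an analysis to particular samples then amounts to a call such as select_samples(observations, ["A", "C"]); as noted above, such distinctions are usually not necessary.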

For an extensive discussion on sampling (and weighting), see: Survey methods.