The SOEP Samples in Detail
Sample A “Residents in the Federal Republic of Germany” covers persons in private households whose household head does not belong to one of the main foreigner groups of “guestworkers” (i.e. Turkish, Greek, Yugoslavian, Spanish or Italian households). Because Sample A contains only a few foreigners, it is often called the “West German Sample” of the SOEP. In 1984 it covered 4,528 households with a sampling probability of about 0.0002.
Sample B “Foreigners in the Federal Republic of Germany” adds persons in private households with a Turkish, Greek, Yugoslavian, Spanish or Italian household head; in 1984 these nationalities constituted the main groups of foreigners in the FRG. Compared to Sample A, the population of Sample B is oversampled, with a sampling probability of about 0.002. The first wave included 1,393 households in Sample B.
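The oversampling of Sample B relative to Sample A can be made concrete with a little arithmetic. The following sketch uses only the approximate probabilities and household counts quoted above to compute an inverse-probability design weight and the ratio of the two sampling probabilities. The function names are illustrative and not part of any SOEP tooling, and weights delivered with survey data typically include further adjustments that are not described here.

```python
# Approximate first-wave (1984) figures quoted in the text.
SAMPLES = {
    "A": {"households": 4528, "sampling_probability": 0.0002},
    "B": {"households": 1393, "sampling_probability": 0.002},
}


def design_weight(sampling_probability: float) -> float:
    """Inverse-probability weight: households represented per sampled household."""
    if not 0 < sampling_probability <= 1:
        raise ValueError("sampling probability must lie in (0, 1]")
    return 1.0 / sampling_probability


def oversampling_factor(p_target: float, p_reference: float) -> float:
    """Ratio of the target sampling probability to the reference one."""
    return p_target / p_reference


weight_a = design_weight(SAMPLES["A"]["sampling_probability"])
weight_b = design_weight(SAMPLES["B"]["sampling_probability"])
factor = oversampling_factor(
    SAMPLES["B"]["sampling_probability"],
    SAMPLES["A"]["sampling_probability"],
)

print(f"Design weight A: about {weight_a:.0f}")
print(f"Design weight B: about {weight_b:.0f}")
print(f"Sample B is oversampled by a factor of about {factor:.0f}")
```

With the figures above, a Sample A household stands for roughly 5,000 households and a Sample B household for roughly 500, so Sample B is oversampled by a factor of about 10.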
Sample C “German Residents in the German Democratic Republic (GDR)” consists of persons in private households where the household head was a citizen of the German Democratic Republic (GDR). This meant that approximately 1.7% of the residential population in the GDR in June 1990 was excluded from the sample as foreigners (who were mostly institutionalized). All in all, 2,179 households represent the starting size of this sample with a sampling probability of about 0.0005.
Sample D “Immigrants” started in 1994/95 with two different samples. In 1994, the first sample D1 had 236 households and in 1995, the second sample D2 had 295 households, leading to a total of 531 households (D1 and D2) in 1995. This sample consisted of households in which at least one household member had moved from abroad to West Germany after 1984. The sampling probability is about 0.0002.
Sample E “Refreshment” was added in 1998, selected from the entire population of private households in Germany. The households were chosen independently of the ongoing panel and its subsamples A through D, with the aims of increasing the number of observations of the general population and preserving its representativity. The selection scheme used for sample E essentially resembles the one used in subsample A. The number of households in the first wave of subsample E was 1,060, with a sampling probability of about 0.00005. With the data distribution of 2012, parts of subsample E have been extracted into the SOEP Innovation Sample. Sample E is also the first sample in which the Computer Assisted Personal Interview (CAPI) was implemented. At that time, interviews in Samples A-D were conducted entirely as Paper-and-Pencil Interviews (PAPI). To study mode effects, households of sample E were randomly allocated to CAPI and PAPI mode.
Sample F “Refreshment” was selected in 2000 from the population of private households, independently of all other subsamples. The selection scheme was slightly altered compared to the previous addition in Sample E: while ‘German’ households (all household members aged 16 or older have German nationality) were selected with a sampling probability of 0.00028, ‘non-German’ households (at least one adult does not have German nationality) were oversampled with a probability of 0.0005. Overall, 6,043 households were added in subsample F’s first wave.
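The household classification used for Sample F can be expressed as a simple rule. The sketch below assumes a hypothetical `Member` record with an age and a nationality string; these names, and the encoding of German nationality as `"German"`, are illustrative assumptions rather than SOEP variables. It assigns each household the sampling probability quoted above.

```python
from dataclasses import dataclass

# Sampling probabilities quoted in the text for Sample F.
P_GERMAN = 0.00028
P_NON_GERMAN = 0.0005
ADULT_AGE = 16  # the text counts household members aged 16 or older as adults


@dataclass(frozen=True)
class Member:
    """Hypothetical household member record (not a SOEP data structure)."""
    age: int
    nationality: str


def is_german_household(members: list[Member]) -> bool:
    """True if every member aged 16 or older has German nationality."""
    adults = [m for m in members if m.age >= ADULT_AGE]
    return all(m.nationality == "German" for m in adults)


def sampling_probability(members: list[Member]) -> float:
    """Sampling probability of a household under the Sample F scheme."""
    return P_GERMAN if is_german_household(members) else P_NON_GERMAN


household_1 = [Member(45, "German"), Member(43, "German"), Member(12, "Italian")]
household_2 = [Member(38, "German"), Member(36, "Italian")]

print(sampling_probability(household_1))
print(sampling_probability(household_2))
```

Here `household_1` counts as ‘German’ because its only non-German member is under 16, while `household_2` is ‘non-German’. A household without any member aged 16 or older would be classified as ‘German’ by this sketch, an edge case the text does not address.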
Sample G “High Income” entered the SOEP in 2002 independently of all other subsamples. The original selection scheme required that responding households had a monthly income of at least DM 7,500 (EUR 3,835). Because an adequate sampling frame was lacking, these households were identified using a screening procedure. This sample of 1,224 households in total increased the potential for analyses in the high-income range, which were previously difficult to conduct because of low case numbers. The derived sampling probability is about 0.0014. Starting with Wave 2 in 2003, the selection scheme for this subsample was changed such that only households with a net monthly income of at least EUR 4,500 were followed.
Sample H “Refreshment” started in 2006 as a random sample, again independent of all previous subsamples, covering all residential households in Germany. The 1,506 added households were drawn with a sampling probability of 0.0001.
Sample I “Incentive Sample” started in 2009. In its first wave, a new incentive scheme was tested to increase participation rates (see also [sec:PanelCare]). The sampling was independent of all other SOEP samples and added a total of 1,531 households to the SOEP. Their sampling probability was 0.00013. This sample remained in the main data distribution for its first two waves (i.e. 2009 and 2010, or waves Z and BA). With the data distribution of 2012, subsample I has been extracted into the SOEP Innovation Sample.
Sample J “Refreshment Sample” started in 2011 as a random sample that was drawn independently of all previous subsamples, covering the residential households in Germany. The 3,136 added households were drawn with a sampling probability of 0.0002.
Sample K “Refreshment Sample” started in 2012 as a random sample, drawn independently of all previous subsamples, covering the residential households in Germany. The 1,526 added households were drawn with a sampling probability of 0.0001.
Sample L1 “Cohort Sample” covers private households in Germany in which at least one household member is a child born between January 2007 and March 2010. Migrants, identified by an “onomastic procedure”, are again oversampled. Sample L1 (as well as L2 and L3) was part of the SOEP-related study “Familien in Deutschland” (FiD), which was integrated into the SOEP in 2014. The study was part of an evaluation project of the Federal Ministry for Family Affairs, Senior Citizens, Women and Youth (BMFSFJ) and the Federal Ministry of Finance (BMF) and focused on public benefits in Germany for married people and families. Therefore, the survey instruments of waves BA to BD differ in some parts from those of the other samples.
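The wave labels mentioned above (BA to BD) follow a letter convention for survey years. The sketch below assumes that wave A denotes 1984, the year of the first wave of Samples A and B, that one letter is used per year up to Z, and that labelling then continues with BA, BB and so on. The function name `wave_label` is illustrative, and the sketch deliberately stops at BZ because the text gives no indication of how labels continue beyond that.

```python
import string

FIRST_WAVE_YEAR = 1984  # assumption: wave "A" is the 1984 survey year


def wave_label(year: int) -> str:
    """Map a survey year to a wave label: A..Z, then BA..BZ (assumed convention)."""
    letters = string.ascii_uppercase
    offset = year - FIRST_WAVE_YEAR
    if offset < 0:
        raise ValueError("year precedes the first wave")
    if offset < 26:
        # Single-letter labels A..Z for the first 26 survey years.
        return letters[offset]
    if offset < 52:
        # Two-letter labels BA..BZ for the following 26 survey years.
        return "B" + letters[offset - 26]
    raise ValueError("labels beyond BZ are not covered by this sketch")


print([wave_label(year) for year in range(2010, 2014)])
# ['BA', 'BB', 'BC', 'BD']
```

Under these assumptions, waves BA to BD correspond to the survey years 2010 to 2013.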
Sample L2 “Family Types I” covers private households in Germany that meet at least one of the following criteria regarding their household composition: single parents, low-income families and large families with three or more children. As with Sample G, the eligible sub-population is relatively small and an adequate sampling frame is lacking. Eligible households were therefore again identified by a preceding telephone screening procedure.
Sample L3 “Family Types II” covers private households in Germany that meet at least one of the following criteria regarding their household composition: single parents or large families with three or more children. It was conducted analogously to Sample L2 in order to increase the number of cases in these sub-populations.
Sample M1 “Migration Sample” was added in 2013 as a new migration sample with around 2,700 households, drawn using register information of the German Federal Employment Agency.
Sample M2 “Migration Sample” was added in 2015 as a further migration sample with around 1,100 households, drawn using register information of the German Federal Employment Agency.
Sample M3 “Refugee Sample” was drawn in 2016 as a new refugee sample for the IAB-BAMF-SOEP Refugee Survey, in which roughly 1,769 households of displaced persons are repeatedly interviewed. Respondents aged 18 and older who entered Germany between January 2013 and December 2016 and who filed an asylum application (regardless of their current legal status) were interviewed, as were the members of their households.
Sample M4 “Refugee Family Sample” is part of the 2016 “IAB-BAMF-SOEP Refugee Survey” (Samples M3 and M4), a joint project of the Institute for Employment Research (IAB), the Research Center of the Federal Office for Migration and Refugees (BAMF-FZ) and the Socio-economic Panel (SOEP). The target population of the samples consists of 1,769 households with individuals who arrived in Germany between January 2013 and January 2016 and applied for asylum or were hosted as part of specific programs of the federal states (irrespective of their asylum procedure and their current legal status). The first part of the sample (M3) was financed with funds from the research budget of the Federal Employment Agency (BA) allocated to the IAB. Sample M4 was funded by the Federal Ministry of Education and Research (BMBF) and has a focus on refugee families.
Sample M5 “Refugee Sample” is the third top-up sample of refugee households. The population of M5 covers adult refugees who have applied for asylum in Germany since January 1, 2013, and are currently living in Germany. The first wave of M5 was conducted in 2017. M5 added another 1,519 households of refugees who have migrated to Germany since 2013 to the SOEP framework.
Sample N “Refreshment Sample (PIAAC-L)” integrated 2,314 households of former participants of the Program for the International Assessment of Adult Competencies (PIAAC and PIAAC-L) in 2017. This is the most recent addition to the SOEP-Core samples. Fieldwork in sample N was conducted between mid-March and mid-August and thus slightly later than in the majority of samples A–L1.
More information about “Sample Sizes” and “Panel Attrition” can be found here