The SOEP Samples in Detail
Sample A “Residents in the Federal Republic of Germany” is one of the two initial samples of the SOEP and covers private households whose household head does not belong to one of the main foreigner groups of “guestworkers” (i.e., Turkish, Greek, Yugoslavian, Spanish or Italian households). Because Sample A contains only a few foreigners, it is often called the “West German Sample” of the SOEP. In 1984 it covered 4,524 households.
Sample B “Foreigners in the Federal Republic of Germany” is one of the two initial samples of the SOEP and covers private households with a Turkish, Greek, Yugoslavian, Spanish or Italian household head, which in 1984 constituted the main groups of foreigners in the FRG. Compared to Sample A the population of Sample B is oversampled. The first wave included 1,393 households in Sample B.
Sample C “German Residents in the German Democratic Republic (GDR)” consists of persons in private households where the household head was a citizen of the German Democratic Republic (GDR). This meant that approximately 1.7% of the residential population in the GDR in June 1990 was excluded from the sample as foreigners (who were mostly institutionalized). All in all, 2,179 households represent the starting size of this sample with a sampling probability of about 0.0005.
Sample D “Immigrants” started in 1994/95 with two different samples. In 1994, the first sample D1 had 236 households and in 1995, the second sample D2 had 295 households, leading to a total of 531 households (D1 and D2) in 1995. This sample consisted of households in which at least one household member had moved from abroad to West Germany after 1984. It mainly consists of ethnic Germans migrating from Eastern Europe to Germany. The sampling probability is about 0.0002.
Sample E “Refreshment I” was added in 1998 and is the first sample that was designed to be representative of all private households in both East and West Germany. It is the first of several regular refreshment samples drawn to increase the overall size of the SOEP, compensate for panel attrition and cover population changes, e.g. due to migration. The selection scheme used for sample E essentially resembles the one used in subsample A. The number of households in the first wave of subsample E was 1,056, with a sampling probability of about 0.00005. With the data distribution of 2012, parts of subsample E were moved into the SOEP Innovation Sample. Sample E is also the first sample in which the Computer Assisted Personal Interview (CAPI) was implemented. Interviews in Samples A-D at this time were conducted entirely as Paper-and-Pencil Interviews (PAPI). To study mode effects, households of sample E were randomly allocated to CAPI and PAPI mode.
Sample F “Refreshment II” covers private households in Germany and substantially increases the sample size of the SOEP in 2000. Experience with the previous samples has shown that migrant households display lower response probabilities. This is why households with at least one adult not having German nationality were oversampled. Sample F covers 6,043 households.
Sample G “High Income” entered the SOEP in 2002 independently from all other subsamples. The original selection scheme required that the responding households had a monthly income of at least DM 7,500 (EUR 3,835), which - due to the lack of an adequate sampling frame - were identified using a telephone screening procedure. This sample of overall 1,224 households increased the potential for analyses in the high income areas, which previously were difficult to conduct because of low case numbers. The derived sampling probability is about 0.0014. Starting with Wave 2 in 2003, the selection scheme for this subsample was changed such that only households with a net monthly income of at least EUR 4,500 were followed.
Sample H “Refreshment III” started in 2006 as a random sample, again independently of all previous subsamples, covering all residential households in Germany. The addition of 1,506 households was drawn with a sampling probability of 0.0001. For the first time in a SOEP subsample, all households were interviewed in the computer-assisted personal interview mode (CAPI).
Sample I “Innovation Sample” started in 2009 with a disproportional sampling design intended to increase the number of migrant households in the SOEP. To this end, an analysis of family names (an “onomastic procedure”) was applied. In 2012, Sample I was completely transferred to SOEP-IS, which is why it is excluded in terms of weighting. The cases are nevertheless integrated in SOEP waves Z to BA (2009 and 2010), but without valid weighting factors.
Sample J “Refreshment IV” started in 2011 as a random sample that was drawn independently of all previous subsamples, covering the residential households in Germany. Again, a disproportional sampling design was implemented in order to increase the number of migrant households in the SOEP. The addition of 3,136 households was drawn with a sampling probability of 0.0002.
Sample K “Refreshment V” started in 2012 as a random sample, drawn independently of all previous subsamples, covering the residential households in Germany. The addition of 1,526 households was drawn with a sampling probability of 0.0001.
Sample L1 “Cohort Sample” covers private households in Germany in which at least one household member is a child born between January 2007 and March 2010. Again, migrants identified by an “onomastic procedure” are oversampled. Sample L1 (as well as L2 and L3) was part of the SOEP-related study “Familien in Deutschland” (FiD), which was integrated into the SOEP in 2014. As part of an evaluation project of the Federal Ministry for Family Affairs, Senior Citizens, Women and Youth (BMFSFJ) and the Federal Ministry of Finance (BMF), the study focused on public benefits in Germany for married people and families. Therefore, the survey instruments of waves BA to BD differ in some parts from those of the other samples.
Sample L2 “Family Types I” covers private households in Germany that meet at least one of the following criteria regarding their household composition: single parents, low income families and large families with three or more children. Similar to Sample G we face the problem that the eligible sub-population is relatively small and an adequate sampling frame is lacking. So again, a preceding telephone screening procedure identifies eligible households.
Sample L3 “Family Types II” covers private households in Germany that meet at least one of the following criteria regarding their household composition: single parents or large families with three or more children. It was conducted analogously to Sample L2 in order to increase the number of cases in these sub-populations.
Sample M1 “IAB-SOEP Migration Sample” was jointly planned and conducted by the Institute for Employment Research (IAB) in Nuremberg and the German Socio-Economic Panel (SOEP) at DIW Berlin in 2013. Register data of the Federal Employment Agency (FEA), the so-called Integrated Employment Biographies (IEB), were used as a sampling frame. The target population consists of individuals in the register as of 31.12.2011 who a) have immigrated to Germany since 1995 or b) are second-generation migrants born in Germany after 1976.
Sample M2 “IAB-SOEP Migration Sample” was added in 2015 as another migration sample, with around 1,096 households drawn using register information of the German Federal Employment Agency. It aimed to collect information on households with recent migrants, that is, individuals who immigrated to Germany between 2009 and 2013.
Sample M3/M4 “IAB-BAMF-SOEP Refugee Survey” was drawn in 2016 as a new refugee sample. It is a joint project of the Institute for Employment Research (IAB), the Research Centre of the Federal Office for Migration and Refugees (BAMF-FZ) and the Socio-Economic Panel (SOEP). The target population of the samples consists of households with individuals who arrived in Germany between January 2013 and January 2016 and applied for asylum or were hosted as part of specific programs of the federal states (irrespective of their asylum procedure and their current legal status). The first part of the sample (M3) was financed with funds from the research budget of the Federal Employment Agency (BA) allocated to the IAB. Sample M4 was funded by the Federal Ministry of Education and Research (BMBF) and has a focus on refugee families.
Sample M5 “IAB-BAMF-SOEP Refugee Survey” is both an enlargement and a refreshment of the former sub-samples M3 and M4. Whereas the target population of M3 and M4 comprises all people who immigrated to Germany between January 2013 and January 2016 and appeared in the Central Register of Foreigners up to April 2016, M5 adds two new groups: first, people who immigrated to Germany between January 2013 and January 2016 and made a claim for asylum after April 2016 but no later than January 2017 (refreshment) and, second, people who immigrated to Germany between February 2016 and December 2016 and made a claim for asylum by January 2017 (enlargement). The sampling is similar to that of M3 and M4, and we propose using all three sub-samples jointly for substantive analyses. Used together, the sub-samples are representative of people who immigrated to Germany and applied for asylum or who were hosted as part of specific programs of the federal states (irrespective of their asylum procedure and their current legal status).
Sample N “Refreshment Sample (PIAAC-L)” integrated 2,378 households of former participants of the Program for the International Assessment of Adult Competencies (PIAAC and PIAAC-L) in 2017. This is the most recent addition to the SOEP-Core samples. Fieldwork in sample N was conducted between mid-March and mid-August and thus slightly later than in the majority of samples A–L1.
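The target populations of the refugee sub-samples M3/M4 and M5 described above are defined by two dates: arrival in Germany and the asylum claim. The following is a minimal sketch of that rule, not SOEP code. The function name `refugee_subsample`, the exact day boundaries (the text gives months only), and the use of a single claim date as a stand-in for appearing in the Central Register of Foreigners are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# Day-level boundaries are assumptions; the sample description names months only.
ARRIVAL_START = date(2013, 1, 1)          # "January 2013"
ARRIVAL_END_EARLY = date(2016, 1, 31)     # "January 2016"
ARRIVAL_END_LATE = date(2016, 12, 31)     # "December 2016"
REGISTER_CUTOFF_M3_M4 = date(2016, 4, 30) # "up to April 2016"
CLAIM_CUTOFF_M5 = date(2017, 1, 31)       # "until January 2017"


def refugee_subsample(arrival: date, claim: date) -> Optional[str]:
    """Return the sub-sample whose target population a person falls into.

    `claim` is a simplification: it stands for both the asylum claim date
    and the date of appearing in the register (hypothetical assumption).
    Returns None if the person is outside all three target populations.
    """
    if arrival < ARRIVAL_START or claim > CLAIM_CUTOFF_M5:
        return None
    if arrival <= ARRIVAL_END_EARLY:
        # Early arrivals: registered by April 2016 -> M3/M4, later -> M5 refreshment.
        if claim <= REGISTER_CUTOFF_M3_M4:
            return "M3/M4"
        return "M5 (refreshment)"
    if arrival <= ARRIVAL_END_LATE:
        # Arrivals from February to December 2016 -> M5 enlargement.
        return "M5 (enlargement)"
    return None


if __name__ == "__main__":
    print(refugee_subsample(date(2015, 9, 10), date(2015, 11, 2)))
    print(refugee_subsample(date(2015, 12, 1), date(2016, 6, 15)))
    print(refugee_subsample(date(2016, 5, 20), date(2016, 8, 1)))
```

The sketch only mirrors the date logic of the description. It does not cover people hosted as part of specific programs of the federal states, and it says nothing about how households were actually drawn.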
More information about “Sample Sizes” and “Panel Attrition” can be found here
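Several of the sample descriptions above quote an approximate sampling probability. As a rough aid to reading these figures, the sketch below computes the inverse of each probability, that is, how many population households one sampled household stands for, and multiplies it by the first-wave household count. This is only the textbook inverse-probability idea applied to the numbers quoted above. It is not how the SOEP weighting factors are derived, and the names `SAMPLING_PROBABILITY`, `FIRST_WAVE_HOUSEHOLDS`, `base_weight` and `implied_households` are hypothetical.

```python
# Approximate first-wave sampling probabilities as quoted in the sample descriptions.
SAMPLING_PROBABILITY = {
    "C": 0.0005,
    "D": 0.0002,
    "E": 0.00005,
    "G": 0.0014,
    "H": 0.0001,
    "J": 0.0002,
    "K": 0.0001,
}

# First-wave household counts as quoted in the sample descriptions
# (D is the combined D1 + D2 total of 236 + 295 households).
FIRST_WAVE_HOUSEHOLDS = {
    "C": 2179,
    "D": 236 + 295,
    "E": 1056,
    "G": 1224,
    "H": 1506,
    "J": 3136,
    "K": 1526,
}


def base_weight(sample: str) -> float:
    """Inverse of the quoted sampling probability (illustrative only)."""
    return 1.0 / SAMPLING_PROBABILITY[sample]


def implied_households(sample: str) -> float:
    """Rough number of population households the first wave stands for."""
    return FIRST_WAVE_HOUSEHOLDS[sample] * base_weight(sample)


if __name__ == "__main__":
    for name in SAMPLING_PROBABILITY:
        print(name, round(base_weight(name)), round(implied_households(name)))
```

Because the quoted probabilities are rounded and the household counts refer to participating households, the results are orders of magnitude at best. Analyses should rely on the weighting factors distributed with the data.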
Sample Specific Questionnaires
In SOEP it is common for special samples to receive extended, adapted and/or integrated questionnaires in the first few years. This ensures that sample-specific questions that do not play a role in the main SOEP can also be included. The following tables show which questionnaires the respective samples received, in which year they were fielded, in which raw data set they were included and into which long data set they were integrated.
From the start of Sample B (foreigners), respondents could complete the individual questionnaire in German or in the respective foreign language. Starting with wave 2 of the panel, there were “old” and “new” survey units (households, persons), and there were survey units with or without certain changes (e.g., households that had or had not moved; individuals who had or had not changed careers). The questionnaires took these changes into account for all sub-groups. Survey procedures and tools were designed to ensure that each subgroup received the appropriate questionnaire. This technique as well as the bilingual design of the foreigner questionnaires was retained for waves 3-6. In addition, retrospective information and missing information on temporary dropouts was collected. The “financial statement”, which is now a survey module, was a separate questionnaire in the year 1988.
SOEP researchers were determined to seize the historic opportunity of German reunification to obtain a first baseline measurement of incomes in the “old” GDR currency. The questionnaire was prepared by an East-West working group including DIW Berlin, WZB, Collaborative Research Centre 3, and the ISS at the Academy of Sciences in the GDR, with the participation of Infratest and its partner organization in the GDR. The result was a questionnaire that covered many of the same themes and questions and was structured similarly to the West SOEP questionnaire, but which focused more on the specific situation in the GDR (e.g., the housing situation).
A major shift in the design of SOEP questionnaires took place with Sample J. Due to the increased panel mortality from wave 1 to wave 2 that was observed for the refresher samples F (2000-2001), H (2006-2007), and I (2009-2010), the biographical module, with an average interview length of 17 minutes, was integrated into wave 1. If this had not been done, no biographical data would have been collected for approximately 20% of all SOEP respondents who would probably not have participated in wave 2. In comparison to the longitudinal samples, data collection in the first wave was focused on the three main questionnaires: the household, the individual, and the youth questionnaire. As the fieldwork in these refresher samples was conducted exclusively by CAPI, it was feasible to include complex modules with event-triggered question loops.
The main focus of Families in Germany (FiD) was on the families and children – the parental questionnaires (filled out by parents about their children) were about twice as long as the comparable questionnaires in SOEP-Core, and questionnaires for the 1-2-year-olds and the 9-10-year-olds were added (as of 2012, SOEP-Core had added a questionnaire for 9-10-year-olds that is partly comparable to the FiD version). In large part, FiD resembled the SOEP. Each adult was asked to answer an individual questionnaire, which, in the first two years, included retrospective questions on childhood, education, and early work experience. In addition, there were several questions designed to capture the challenges families face with regard to the return of mothers into the labor market – with respect to workplace, work schedule, overtime, daycare options, etc.
Following the design shift for refresher samples since Sample J in 2011, respondents have been surveyed on their life history using the “biography questionnaire”, which was integrated into the individual questionnaire from wave 1. This ensures that biographical information will be available for all target persons who provided an individual interview in participating households. Other supplementary questionnaires were not included in the survey instruments given to first-wave respondents to avoid “overburdening” respondents with an extremely lengthy first-wave interview. Questionnaires for the migration boost samples include questions that have been part of SOEP-Core for the last three decades. In addition, the survey covers each respondent’s complete migration history, education, training, and employment history in Germany and abroad, and numerous aspects of cultural and living environments relevant to the social integration of migrants. The household questionnaire is identical to the questionnaire used in the SOEP-Core sample.
As with every other previously established subsample of migrants in the SOEP (M1 and M2), there was a clear need for several deviations from standard SOEP-Core questionnaires to reflect the special characteristics of the target group. Several additional questions concerning migration and integration were incorporated into the individual questionnaire to better field the range of research questions and research goals of the project partners. These included topics such as ethnic background, experiences en route to Germany, language skills, integration courses in Germany, job experience, current occupation, educational background, health, attitudes, and values. The household questionnaire was much more SOEP-related than the individual questionnaire in order to establish longitudinal information on the households.