Data Set Identifiers

Because the data are stored at different observational levels, any analysis requires combining datasets through matching or merging procedures, and these procedures depend on identifiers. The central individual identifier is pid, which is fixed over time and across datasets. Since a person may move to a different household at any point in time, yearly household identifiers, called hid, are necessary. The same information is also stored in $hhnr, which facilitates matching depending on the dataset used. Finally, each individual (respondents as well as children) can be traced back as a member of, or a split-off from, an original household in the very first wave. This household’s ID, which stays fixed no matter how often a person changes households over time, is called cid. In addition, respondents in long-format data can be differentiated by survey year using the syear variable.

The SOEP provides additional identifiers in the various datasets in order to identify respondents and to provide further possibilities for merging datasets. The following list is an excerpt and does not cover all identifier variables; the name of an identifier variable can also change depending on the dataset used.

  • parid “Unchanging Individual identifier of Partner (PID)”

  • pgpartnr “Individual Identifier of Partner”

  • coupid “Couple Identifier”

  • intid “Interviewer Identifier”

  • intid1 “Identifier of First Interviewer”

  • $hhnr “Current Wave HH Number (=HHNRAKT/HID)”

  • hhnrold “HH Number Previous Year With Individual Identifier”

  • vpersnr “Individual Identifier of Deceased Person”

  • bymnr “Individual Identifier Mother”

  • byvnr “Individual Identifier Father”

  • mnr “Individual Identifier Mother”

  • fnr “Individual Identifier Father”

  • kidpnr01-kidpnr15 “PERSNR 1st Child” - “PERSNR 15th Child”

  • sibpnr1-sibpnr11 “Individual Identifier, 1st sibling” - “Individual Identifier, 11th sibling”

  • persnre “Permanent Individual Identifier Respondent” (usually mother)

  • pnrtwin “Individual Identifier 2nd Sibling”

  • pnrtrip “Individual Identifier 3rd Sibling”

  • pnrquad “Individual Identifier 4th Sibling”

  • pnralt “Old Household and Individual Identifier”

  • pnrneu “New Household and Individual Identifier”
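
The merging logic described at the start of this section can be illustrated with a minimal sketch in plain Python. It attaches household-level information to person-year records using hid and syear as the merge key. All values are invented for illustration and are not SOEP data. The variable hh_size and the helper left_merge are hypothetical names introduced here. In practice the same join would normally be done with the merge command of a statistics package.

```python
# Toy person-year records in long format (invented values, not SOEP data).
# pid is fixed over time, hid can change from year to year, and cid keeps
# pointing at the original household.
person_years = [
    {"pid": 101, "syear": 2018, "hid": 11, "cid": 11},
    {"pid": 101, "syear": 2019, "hid": 27, "cid": 11},  # moved to a new household
    {"pid": 102, "syear": 2018, "hid": 11, "cid": 11},
]

# Toy household-year records; hh_size is a hypothetical household variable.
household_years = [
    {"hid": 11, "syear": 2018, "hh_size": 2},
    {"hid": 27, "syear": 2019, "hh_size": 1},
]


def left_merge(left, right, keys):
    """Left-join two lists of dicts on the given key columns.

    Every row of `left` is kept. Columns from the matching row of
    `right` are added when a match exists.
    """
    index = {}
    for row in right:
        key = tuple(row[k] for k in keys)
        if key in index:
            # The key must identify rows of the right-hand table uniquely.
            raise ValueError(f"duplicate key {key} in right table")
        index[key] = row

    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        merged.append({**match, **row})
    return merged


# Household data are matched on the yearly household ID plus survey year.
merged = left_merge(person_years, household_years, keys=("hid", "syear"))

for row in merged:
    print(row["pid"], row["syear"], row["hid"], row["cid"], row.get("hh_size"))
```

The sketch matches on hid together with syear because the household ID describes where a person lived in a given survey year. Datasets at the individual level would be combined in the same way, using pid (and syear for long-format data) as the key.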