Dataset: Generations and Gender Survey Poland Wave 1 & Wave 2

Abstract

The Generations and Gender Survey (GGS) provides micro-level data with the aim of significantly improving the knowledge base for social science and policymaking in Europe and developed countries elsewhere.  
In Europe 2020, the European Union develops a strategy "to help us come out stronger from the crisis and turn the EU into a smart, sustainable and inclusive economy delivering high levels of employment, productivity and social cohesion". The economic crisis affects not only day-to-day decisions, but also fundamental choices at all stages of people's lives: marriage and childbearing, the combination of employment and caring responsibilities for the young and the old, retirement, housing, and ageing well. The GGS has been developed to provide scientists with high-quality data to contribute scientifically grounded answers to these key policy questions. Survey content focuses on intergenerational and gender relations between people, expressed in care arrangements and the organization of paid and unpaid work. Key features of the survey are:
- Cross-national comparability. In each country data is collected on the basis of a common international questionnaire and guidelines about the methodology. Data processing includes central harmonization of national datasets.  
- A broad age range. It includes respondents between the ages of 18 and 80.
- A longitudinal design. It has a panel design, collecting information on the same persons at three-year intervals.  
- A large sample size. It has an average of 9,000 respondents per country at Wave 1.
- A theory-driven and multidisciplinary questionnaire. It provides data for policy relevant research by demographers, economists, sociologists, social policy researchers, social psychologists and epidemiologists. The questionnaire is inspired by the theory of planned behavior.
- Possibility to combine the survey data with macro data provided by the GGP Contextual Database. This combination enables analyses of individuals and families in their cultural, economic, political, social and policy contexts.

Variable Groups

Document Description

Full Title

Generations and Gender Survey Poland Wave 1 & Wave 2

Alternative Title

GGS Poland Wave 1 & Wave 2

Identification Number

GGSW1.W2.26

Date of Distribution

2018-02-26

Version

Working Version: GGS Wave 1 Version 4.3. and GGS Wave 2 Version 1.3.

Publication of Wave 2 dataset.

Date: 2018-02-26

Guide To Codebook

In the field “Study Description”, users can find metadata about surveys. This includes the distributors, keywords, abstract, and guidelines on the bibliographic citation.  
Country specific metadata include information on survey producers, methodology and processing. This information was provided by the GGP-country teams, based on a metadata grid with pre-structured questions. Links to relevant references (e.g., working papers and questionnaires) are also provided.  

The field “Data Files Description” provides metadata about the data file, such as file contents, missing values, as well as changes across different GGS versions.

The field "Variable Description" provides information on each variable, such as question text, descriptions of country specific categories and variables, universe (i.e., subset of respondents to whom the question was asked), country specific deviations to GGS routing, descriptions of the ways in which consolidated and derived variables are calculated. Variables are ordered according to the sections of the GGS codebook.

PLEASE NOTE THAT WE DOCUMENT ONLY VARIABLES WITH VALID CASES.
VARIABLES WHOSE CASES ARE ALL SYSTEM MISSING ARE NOT DOCUMENTED.
This is why the total number of variables in the documentation is smaller than the total number of variables in the SPSS and STATA files.

Full Title

GGS_W1-V.4.3.&W2-V.1.3_Poland

Producer

Name | Affiliation | Abbreviation | Role
Arianna Caporali | Institut national d'études démographiques (INED) | AC |

Study Description

Full Title

Generations and Gender Survey Poland Wave 1 & Wave 2

Alternative Title

GGS Poland Wave 1 & Wave 2

Parallel Title

Generacje, rodziny i płeć kulturowa (GGS-PL)

Identification Number

GGSW1.W2.26

Authoring Entity

Name | Affiliation
Irena E. Kotowska (Primary Investigator Wave 1 and Wave 2) | Institute of Statistics and Demography, Warsaw School of Economics (ISD WSE)
Janina Jóźwiak (Primary Investigator Wave 1) | Institute of Statistics and Demography, Warsaw School of Economics (ISD WSE)
Monika Mynarska (Primary Investigator Wave 2) | Institute of Statistics and Demography, Warsaw School of Economics (ISD WSE)
Katarzyna Kocot-Górecka (Primary Investigator Wave 2) | Institute of Statistics and Demography, Warsaw School of Economics (ISD WSE)

Other identifications and acknowledgments

Name | Affiliation | Role
Bureau of Research and Statistical Analyses | Polish Statistical Association | Sample design, data cleaning, weighting (Wave 1 & Wave 2)
Team of interviewers | Central Statistical Office | Fieldwork (Wave 1 & Wave 2)

Producer

Name | Affiliation | Abbreviation | Role
Janina Jóźwiak | Institute of Statistics and Demography, Warsaw School of Economics | ISD WSE | Project leader (Wave 1)
Irena E. Kotowska | Institute of Statistics and Demography, Warsaw School of Economics | ISD WSE | Project coordinator (Wave 1), Project leader (Wave 2)
Monika Mynarska | Institute of Statistics and Demography, Warsaw School of Economics | ISD WSE | Project coordinator (Wave 2)
Katarzyna Kocot-Górecka | Institute of Statistics and Demography, Warsaw School of Economics | ISD WSE | National coordinator (Wave 2)

Funding Agency/Sponsor

Name | Abbreviation | Role | Grant
Ministry of Science and Higher Education | | Funding of Wave 1 | 554/N-UNECE/2009/0
National Science Centre (Poland) | | Funding of Wave 2 | 2013/08/M/HS4/00421

Data Distributor

Name | Affiliation | Abbreviation
Institut national d'études démographiques - 133 boulevard Davout 75980 Paris Cedex 20, France | | INED
Netherlands Interdisciplinary Demographic Institute - Lange Houtstraat 19, NL-2511 CV The Hague, The Netherlands | | NIDI

Depositor

Name Affiliation Abbreviation
Institute of Statistics and Demography, Warsaw School of Economics ISD, WSE

Bibliographic Citation

United Nations 2005. Generations & Gender Programme: Survey Instruments. New York and Geneva: UN, 2005.

List of Keywords

Date of Collection

Start End Cycle
2010-10-20 2011-02-28 Wave 1
2014-09 2015-02 Wave 2

Country

Poland  (POL)

Geographic Coverage

The whole territory of Poland.

Geographic Unit

The 16 Polish provinces (voivodeships).

Unit of Analysis

Individuals

Universe

WAVE 1
Polish-speaking persons aged 18-79 (at the time of the survey), living in private households in Poland.

WAVE 2
Panel approach: all persons participating in Wave 1 were contacted. Additionally, a subsample of persons aged 18-22 was selected, using the same criteria as in Wave 1.

Kind of Data

Survey data

Time Method

Panel

Data Collector

Bureau of Research and Statistical Analyses, Polish Statistical Association

Sampling Procedure

WAVE 1 SAMPLING PROCEDURE
1. Sampling frame
1.1 Type of frame: Lists of geographical units i.e. dwellings list from the National Official Register of Territorial Division of the Country (TERYT).
1.2  Frame coverage: Total population of Poland.
1.3 Frame size: 3798 units of territorial division.
1.4 Level of units available: Dwellings.

2. Sampling method
2.1 Sampling method type: Multistage. The sample was drawn using a two-stage sampling design. The sample size for each voivodeship was established using the square root method. Within voivodeships, stratified sampling was used. The strata were municipalities (gminas), cities or districts (e.g. in Warsaw). The first-stage sampling units (PSU) were enumeration census areas (about 4,000), while at the second stage (SSU) dwellings were selected (on average five per enumeration census area).
2.2 Sampling stage definition
- PSU: Enumeration census areas of 2002 population census.
- SSU: Dwellings.
- TSU: NA.
2.3 Sampling stage size
- PSU: 4,000.
- SSU: 20,000.
- TSU: NA.
2.4 Unit selection: Systematic Random Sampling.
2.5 Final stage unit selection: Simple Random Sampling (SRS).
2.6 Within household unit selection: Last birthday method.
2.7 Stratification: Implicit sampling stratification - the sampling frame included pre-specified strata: communes, cities and districts e.g. in the case of Warsaw
2.8 Sample size
- Starting size sample: To guarantee the sample size of 20,000 individuals, a main sample of 20,000 addresses and two additional samples of 20,000 addresses each were designed. For each PSU three lists of five addresses each were prepared: one primary list (No. 1) and two reserve lists (No. 2 and 3). The rules for obtaining the required number of respondents were described in detail in the survey instructions.
- Aimed total size at Wave 1: 20,000 respondents.
- Aimed total size at Wave 3: no estimation
2.9 Estimated Non-response
- Initial non-response: NA (see below).
- Non response measures: Oversampling - a main sample and two additional samples were selected of 20,000 addresses each. In order to achieve the planned sample size, it was necessary to visit nearly 48,000 sampled addresses.  
- Within household non-response measures: None, the household was marked as a non-response.
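
The "square root method" named in item 2.1 is commonly understood as allocating the sample across regions in proportion to the square root of each region's population, which gives smaller regions relatively more interviews than a proportional allocation would. The following Python sketch illustrates that general idea only. The documentation does not give the exact formula used for GGS-PL, and the region labels and population figures below are hypothetical round numbers chosen for the example.

```python
import math

def sqrt_allocation(populations, total_sample):
    """Allocate total_sample across regions proportionally to sqrt(population).

    Assumption: this is the textbook square-root allocation, not necessarily
    the exact rule applied in the survey.
    """
    roots = {name: math.sqrt(pop) for name, pop in populations.items()}
    total_root = sum(roots.values())
    return {name: total_sample * root / total_root for name, root in roots.items()}

# Hypothetical regions and population sizes (not taken from the survey).
populations = {"region_A": 4_000_000, "region_B": 1_000_000, "region_C": 250_000}
allocation = sqrt_allocation(populations, total_sample=20_000)

for name, size in allocation.items():
    print(f"{name}: {size:.0f} addresses")
```

In this example region_A is 16 times as populous as region_C but receives only 4 times as many addresses. In practice the allocations would still need to be rounded to whole numbers.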

WAVE 2 SAMPLING PROCEDURE
1. Sampling frame
1.1 Type of frame: Dwellings list from the National Official Register of Territorial Division of the Country (TERYT).
1.2  Frame coverage: Total population of Poland
1.3 Frame size
1.4 Level of units available: Dwellings/households

2. Sampling method
2.1 Sampling method type: Multistage - for the additional sample of persons aged 18-22. Dwellings for persons aged 18-22 were randomly drawn from the LFS sample using a two-stage sampling design, taking into account the information on the age of the household members.
2.2 Sampling stage definition
- PSU: Enumeration census areas of 2011 population census for the additional sample of persons aged 18-22.
- SSU: Dwellings/households.
- TSU: NA.
2.3 Sampling stage size
- PSU: NA.
- SSU: NA.
- TSU: NA.
2.4 Unit selection: Systematic Random Sampling.
2.5 Final stage unit selection: Simple Random Sampling (SRS).
2.6 Within household unit selection: Last birthday method.
2.7 Stratification: Implicit sampling stratification - for the additional sample of persons aged 18-22.
2.8 Sample size
- Starting size sample: Panel - the list of 19,987 addresses of Wave 1 respondents; New sample: 2,150 addresses of the primary sample and 2,219 addresses of the reserve sample for newly interviewed persons aged 18-22
- Aimed total size: 15,000 respondents.
- Aimed total size at Wave 3:  no estimation.
2.9 Estimated Non-response
- Non response measures: Oversampling of newly interviewed persons aged 18-22.
- Within household non-response measures: None, the household was marked as a non-response.
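
Item 2.6 names the last birthday method for choosing one respondent within a household. A minimal Python sketch of that rule is given below, assuming that each member's date of birth is known and that eligibility follows the Wave 1 age range of 18-79. The function names, the data layout and the handling of 29 February are illustrative choices, not taken from the survey instructions.

```python
from datetime import date

def age_on(birth: date, ref: date) -> int:
    """Completed years of age on the reference date."""
    return ref.year - birth.year - ((ref.month, ref.day) < (birth.month, birth.day))

def last_birthday(birth: date, ref: date) -> date:
    """Most recent birthday anniversary on or before the reference date."""
    def anniversary(year: int) -> date:
        try:
            return birth.replace(year=year)
        except ValueError:
            # Born on 29 February: use 28 February in non-leap years (assumption).
            return date(year, 2, 28)
    this_year = anniversary(ref.year)
    return this_year if this_year <= ref else anniversary(ref.year - 1)

def select_respondent(members, ref, min_age=18, max_age=79):
    """Return the name of the eligible member with the most recent birthday, or None."""
    eligible = [(name, birth) for name, birth in members
                if min_age <= age_on(birth, ref) <= max_age]
    if not eligible:
        return None
    return max(eligible, key=lambda member: last_birthday(member[1], ref))[0]

# Hypothetical household visited on 1 December 2010.
household = [
    ("member_1", date(1970, 3, 15)),
    ("member_2", date(1985, 11, 2)),
    ("member_3", date(2005, 1, 1)),   # too young to be eligible
]
selected = select_respondent(household, ref=date(2010, 12, 1))
print(selected)
```

Here member_2 is selected, because 2 November is the most recent birthday among the eligible adults before the visit date.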

Mode of Data Collection

Method: Face-to-face
Technique: PAPI

Type of Research Instrument

Structured questionnaire in Polish.

Characteristics of Data Collection Situation

WAVE 1 DATA COLLECTION  
1. Interviewers
1.1 Total number of interviewers: NA
1.2 Number of interviewers in the field: NA
1.3 Network organization: Fieldwork coordinators designated by the Bureau of Research and Statistical Analyses which was responsible for the fieldwork.
1.4 Working arrangement of interviewers: Professional interviewers employed by the Central Statistical Office (CSO) and experienced in performing CSO surveys.
1.5 Payment of interviewers: Paid per interview.

2. Interviewer training
2.1 General interviewing: Coordinators of the field study and selected interviewers had already been trained in general interviewing: interviewing techniques, appointment scheduling and general knowledge of surveys carried out by the CSO.
2.2 Survey specific: In July 2010, fieldwork coordinators and selected regional trainers of the CSO were trained in GGS-specific issues. They were responsible for providing GGS-specific training to interviewers in each voivodeship.
2.3 Length: 1 day + contacts during the pilot survey. The pilot survey report was prepared and sent to the coordinators.
2.4 Control of performance: Completed questionnaires were checked; an FAQ was available during the fieldwork.
2.5 Interviewer survey: Yes. It mainly included questions about the conditions under which the interview was conducted (e.g., how the questionnaire was filled in, the presence of other household members, their possible influence on the answers given, the general attitude of the respondent).

3. Contact protocols
3.1 Advance letter: Before the interviewer's visit, each household selected in the sample received a letter describing the study (content of the survey, its being part of the GGP, the agencies involved in the survey, the future use of the data, etc.). The household was also invited to visit the project website (http://www.sgh.waw.pl/ggs-pl/) for more information about the survey and the whole project.
3.2 Cold contacts: Face-to-face.
3.3 Scheduling / scattering: In the case of a short-term absence of the selected respondent, the interview could be postponed to a later date. In the case of a long-term absence of the selected respondent, the interview could not be completed and the case was classified as a non-response.
3.4 Contact history: No. The contact history was recorded only when an interview was conducted or attempted.
3.5 Min number of contacts: 2
3.6 Max number of contacts: 4

4. Questionnaire localization
4.1 Validation: Back translation
4.2 Pre-test: The usual survey rules were applied: the first version of the questionnaire was tested in the pilot survey (carried out in August 2010) on a sample of 100 persons.
After each interview, interviewers completed a short survey on the quality of the interview. The data is available from ISD upon request. The ISD team members were at the disposal of the regional survey coordinators and could be contacted by phone or e-mail. An FAQ was available during the fieldwork.

WAVE 2 DATA COLLECTION  
1. Interviewers
1.1 Total number of interviewers: NA
1.2 Number of interviewers in the field: NA
1.3 Network organization: Fieldwork coordinators designated by the Bureau of Research and Statistical Analyses which was responsible for the fieldwork.
1.4 Working arrangement of interviewers: Professional interviewers employed by  the Central Statistical Office (CSO) and experienced in performing CSO surveys.
1.5 Payment of interviewers: Paid per interview.

2. Interviewer training
2.1 General interviewing: Coordinators of the field study and selected interviewers had already been trained in general interviewing: interviewing techniques, appointment scheduling and general knowledge of surveys carried out by the CSO.
2.2 Survey specific: The fieldwork coordinators of the CSO were already familiar with the GGS; in July 2014 they received detailed information on the new aspects of Wave 2. They were responsible for providing GGS-specific training to interviewers in each voivodeship.
2.3 Length: NA
2.4 Control of performance: Completed questionnaires were checked; an FAQ was available during the fieldwork.
2.5 Interviewer survey: Yes. It mainly included questions about the conditions under which the interview was conducted (e.g., how the questionnaire was filled in, the presence of other household members, their possible influence on the answers given, the general attitude of the respondent).

3. Contact protocols
3.1 Advance letter: Before the interviewer's visit, each household selected in the sample received a letter describing the study (content of the survey, its being part of the GGP, the agencies involved in the survey, the future use of the data, etc.). The household was also invited to visit the project website (http://www.sgh.waw.pl/ggs-pl/) for more information about the survey and the whole project.
3.2 Cold contacts: Face-to-face.
3.3 Scheduling / scattering: In Wave 2, new possible reasons for non-interview arose: the address no longer existed, the respondent had left the household (in which case the interviewer sought the new address), the respondent had died, or the household had moved elsewhere. Therefore, in addition to contact rules similar to those of Wave 1, further rules were established to reduce attrition.
3.4 Contact history: No. The contact history was recorded only when an interview was conducted or attempted.
3.5 & 3.6 Min and Max number of contacts: The Wave 1 rule for contacting a respondent was extended: if a respondent or household could not be located at the Wave 1 address, the new address or location had to be established whenever possible. If a respondent had moved to another place of residence in Poland, an attempt to contact them was recommended, either by the same regional team of interviewers or by the team of the new region.

4. Questionnaire localization
4.1 Validation: Back translation
4.2 Pre-test: Interviews were carried out in July 2014: the Wave 2 questionnaire (for respondents in the panel) was tested internally in selected regional statistical offices. The fieldwork coordinators of the CSO were responsible for collecting all comments and doubts concerning the questionnaire. After all comments had been discussed, both the survey questionnaire and the survey instructions were adjusted accordingly.
After each interview, interviewers completed a short survey on the quality of the interview. The data is available from ISD upon request. The ISD team members were at the disposal of the regional survey coordinators and could be contacted by phone or e-mail. An FAQ was available during the fieldwork.

Actions to Minimize Losses

WAVE 1 ACTIONS
1.  Dealing with nonresponse
1.1 Screening: No
1.2 Refusal conversion: The interviewers, being professional CSO interviewers, did not need special measures. However, during the general training some attention was paid to how to introduce sensitive questions (e.g. regarding fertility, partnership history) to respondents and which arguments might be used to overcome respondents' hesitations.

2. Tracking of sampled units
2.1 Respondent contact information: Yes. Respondent's address, telephone number and e-mail were collected.
2.2 Other contact information: Yes, from the household members predominantly.
2.3 Cards: No.
2.4 Additional surveys: No
2.5 Administrative records: No

WAVE 2 ACTIONS
1.  Dealing with nonresponse
1.1 Screening: No
1.2 Refusal conversion: Experience gained during Wave 1 by researchers and interviewers was drawn on during the general training for Wave 2 to discuss how to approach sensitive questions.

2. Tracking of sampled units
2.1 Respondent contact information: Yes. Respondent's address, telephone number and e-mail were collected.
2.2 Other contact information: Yes. Respondent's address, telephone number and e-mail were either collected or checked.  
2.3 Cards: No.
2.4 Additional surveys: No
2.5 Administrative records: No

Control operations

WAVE 1 CONTROLS  
Some preliminary data control has been performed by the coordinators who supervised questionnaires delivered by principal investigators. They used the control scheme prepared by the ISD team.

WAVE 2 CONTROLS  
Some preliminary data control has been performed by the coordinators who supervised questionnaires delivered by principal investigators. They used the special control scheme prepared by the ISD team.

Weighting

WAVE 1 WEIGHT VARIABLES
The weights based on current estimates of the population structures have been prepared by a person from the CSO involved in the regular surveys run by the CSO. In general, the applied formulas take into account: a probability of dwelling selection, coverage of the survey by voivodship and class of locality, use of the additional samples, and demographic characteristics of the population in private households. Since the 2011 population census (PC 2011) delivered data for new estimates of population structures, the weights proposed for Wave 1 should be considered as preliminary ones. The new set of weights based on the PC 2011 is available.   

The variable aweight will be provided in a later update of the dataset.

WAVE 2 WEIGHT VARIABLES
The weights based on the PC 2011 have been prepared by a person from the CSO involved in the regular surveys run by the CSO. In general, the applied formulas take into account: a probability of dwelling selection, coverage of the survey by voivodship and class of locality, use of the additional samples, and demographic characteristics of the population in private households.

Cleaning Operations

The data were cleaned using consistency and logical checks. In ambiguous situations the original questionnaires were checked.

Response Rate

WAVE 1  
Frequency of final disposition codes:  
I = Complete interview: 18,310
P = Partial interview: 1,677
NE = Not eligible: NA. Because of the oversampling (main sample and two additional samples) and the procedure applied, detailed data on the reasons for not conducting an interview at each address approached by an interviewer are not available.

WAVE 2
Frequency of final disposition codes:  
I = Complete interview: 13,423
P = Partial interview: 57
NE = Not eligible: NA
NC = Non-contact: 980
R = Refusal: 2,316
O = Other non-response: 1,902
UC = Unknown eligibility, contacted: NA
UN = Unknown eligibility, non-contact: NA
eC = Estimated proportion of contacted cases of unknown eligibility that are eligible: NA
eN = Estimated proportion of non-contacted cases of unknown eligibility that are eligible: NA
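
The Wave 2 disposition codes above are laid out in the form used by standard response-rate formulas. As an illustration only, the Python sketch below computes two rates in the style of the AAPOR minimum response rates, I / (I + P + R + NC + O + UC + UN) and (I + P) over the same denominator, from the reported frequencies. Treating the NA entries (NE, UC, UN) as zero is an assumption made here for the arithmetic, not something the documentation states, so the results are indicative rather than official figures.

```python
# Wave 2 final disposition frequencies as reported above; None stands for "NA".
dispositions = {
    "I": 13423,   # complete interviews
    "P": 57,      # partial interviews
    "NC": 980,    # non-contacts
    "R": 2316,    # refusals
    "O": 1902,    # other non-response
    "UC": None,   # unknown eligibility, contacted (NA)
    "UN": None,   # unknown eligibility, non-contact (NA)
}

def count(code):
    """Return the frequency for a code, treating NA as 0 (assumption)."""
    value = dispositions[code]
    return 0 if value is None else value

denominator = sum(count(code) for code in ("I", "P", "R", "NC", "O", "UC", "UN"))
rate_complete = count("I") / denominator
rate_with_partials = (count("I") + count("P")) / denominator

print(f"cases in denominator: {denominator}")
print(f"complete interviews only: {rate_complete:.3f}")
print(f"complete + partial interviews: {rate_with_partials:.3f}")
```

Under these assumptions both rates come out at roughly 72 percent. If the cases of unknown eligibility were in fact non-zero, the true rates would be lower.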

Completeness of Study Stored

WAVE 1
Data on 19,987 respondents of 20,000 interviewed were coded and stored in the database; 13 interviews contained some data inconsistencies which could not be corrected.

WAVE 2
Of the 13,603 interviews collected, data for 13,480 respondents were coded and stored in the database. The main reason for removing 123 interviews was that the respondent was likely a different person from the one interviewed in Wave 1; data inconsistencies that could not be corrected also occurred.
The cases belonging to the new Polish W2 sample, which consists of 1,186 interviews, have not been included in the harmonized Wave 2 datafile. These cases were asked a mixture of W1 and W2 questions.

Restrictions

In order to access micro data files, users have to sign and submit a Statement of affiliation, confidentiality and acceptable usage. They also have to submit a title and abstract of their research project. They can use the data for all their research projects, except for datasets from Australia, Belgium and Norway. Users of these datasets need to submit a new application form if they want to use the data in a different research project. The access rights from Wave 1 data are NOT transferred to the Wave 2 data. Wave 1 data users need to submit a new application form to gain access to Wave 2 datasets.

Access Authority

Name | Affiliation | E-mail address | Universal Resource Identifier
UNECE Population Unit - Palais des Nations - CH-1211 Geneva 10 - Switzerland. Tel: +41 22 917 24 77 - fax: +41 22 917 01 07 | | ggp@unece.org | http://www.unece.org/pau/

Citation Requirement

In any work emanating from research based on the Generations and Gender Survey micro-data, I will acknowledge that these data were obtained from the GGP Data Archive and refer to the publication that describes the model survey instruments: United Nations 2005. Generations & Gender Programme: Survey Instruments. New York and Geneva: UN, 2005.

Deposit Requirement

Users of GGS micro-data are required to send any research papers based on the Generations and Gender Survey micro-data or aggregate tabulations to the Population Activities Unit of the UN Economic Commission for Europe, for inclusion in the GGP publications archive.

Conditions

To access the data, users must subscribe to the GGP Data User Space and follow the instructions available on the GGP data access webpage.

Disclaimer

The authors and producers bear no responsibility for the uses of the GGS data, or for interpretations or inferences based on these uses. The producers accept no liability for indirect, consequential or incidental damages or losses arising from use of the data collection, or from the unavailability of, or break in access to the service for whatever reason.

Related Materials

National website of GGS in Polish

Poland_Questionnaire_W1_plk

Poland_Questionnaire_W1_en

Poland_Questionnaire_W2_plk

Poland_Cards_W2_plk

Other References Note

Polish country presentations at the GGP International Working Group Meetings

The life of Poles: From leaving the parental home to retirement. Insights from the Generations and Gender Survey (GGS-PL).

Irena E. Kotowska, Anna Matysiak, Monika Mynarska (Eds.). Warsaw School of Economics Collegium of Economic Analysis Institute of Statistics and Demography.

Data Files Description

File Name

GGS_Wave1_Poland_V.4.3..NSDstat

Contents of Files

VARIABLES HAVING ALL SYSTEM MISSING CASES ARE DROPPED BEFORE PUBLICATION IN NESSTAR.
This is the reason why the total no. of variables in the Nesstar data file is smaller than the total number of variables in the SPSS and STATA files.

Variables are ordered according to the sections of the GGS codebook: Household, Children, Partnerships, Household Organisation and Partnership Quality, Parents and Parental Home, Fertility, Health and Well-Being, Respondent's Activity and Income, Partner's Activity and Income, Household Possessions, Income and Transfers, Value Orientations and Attitudes, Interviewers' report.
The variables begin with a letter designating the wave of data collection ("a" for the first wave, "b" for the second wave). We have attempted to keep the names of variables the same across the waves; new variables are identified by the pattern ["wave letter"]n, e.g., bn301.
Although we encourage the countries to strictly follow the GGS Questionnaire, countries might implement a question that differs to a considerable extent from the GGS Questionnaire. In this case we either add country specific response values or introduce a country specific variable.
Country specific values are added when the question follows the model questionnaire, but the answer categories are not, or only partly, compatible. They are at least 4 digits long (F4 format) and begin with the country code: e.g., Australia 2401, where 24 is the country code for Australia.
A country specific variable is introduced when the question differs from the model questionnaire while measuring the same concept. Such variables are identified by a suffix consisting of the country code plus a number, e.g., Australia a119_2401.
For an overview of the GGS country codes, please refer to the variable "acountry".
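
The naming rules above can be expressed as a small parser. The Python sketch below splits a variable name into wave, new-variable flag, base name and, where present, a country-specific suffix. The regular expression, the returned field names and the rule that a numeric suffix of four or more digits starts with a two-digit country code are illustrative assumptions based only on the examples a119_2401 and bn301; other names in the files (for instance grid or derived variables such as ahg3_1 or numpartners) do not follow this pattern and are simply not matched.

```python
import re

# Assumed pattern: wave letter, optional "n" for new variables, a numeric base
# with an optional trailing letter, and an optional numeric suffix.
PATTERN = re.compile(r"^(?P<wave>[ab])(?P<new>n?)(?P<base>\d+[a-z]?)(?:_(?P<suffix>\d+))?$")

def parse_variable(name):
    """Parse a GGS-style variable name; return None if it does not fit the pattern."""
    match = PATTERN.match(name)
    if match is None:
        return None
    suffix = match.group("suffix")
    country_code = None
    grid_row = None
    if suffix is not None:
        if len(suffix) >= 4:
            # Assumption: the first two digits are the country code (e.g. 24).
            country_code = int(suffix[:2])
        else:
            # Assumption: short suffixes are grid subscripts ("_#").
            grid_row = int(suffix)
    return {
        "wave": 1 if match.group("wave") == "a" else 2,
        "is_new": match.group("new") == "n",
        "base": match.group("base"),
        "country_code": country_code,
        "grid_row": grid_row,
    }

print(parse_variable("a119_2401"))
print(parse_variable("bn301"))
print(parse_variable("numpartners"))
```

The authoritative mapping between country codes and countries remains the variable "acountry".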

File Structure

Record Group

Overall Case Count

19987

Overall Variable Count

1064

Type of File

Nesstar 200801

Extent of Processing Checks

DATA HARMONISATION
The data is submitted in an already pre-harmonised form. It is prepared and organised according to the GGS standards.  
Harmonisation aims at achieving a clear and comparable format of the GGS micro-data files that is adequate for cross-country comparison. The harmonisation procedure consists of the following steps:
1. Label checks  
This step makes sure that all the variables are named the same across the countries and refer to a particular question in the GGS Questionnaire. Also the value labels are checked. They should be the same across GGS datasets.  
2. Dealing with grids
The GGS Questionnaire holds several grids of either event history information or members of the household. Such data needs to be harmonized with specific attention to the order and logical consistency of grid rows (whether household members or events such as births). In the data, each row of the grid is represented by a variable name followed by a subscripted number ("_#"). Each subscript thus represents one household member or one event. Part of the grid harmonization is grid sorting. Grid rows are sorted according to a pre-defined key. For example, in the household grid, the household members are sorted according to their relationship to the respondent, i.e. the relation-to-respondent variable (ahg3_# or bhg3_#). Respondents appear first, followed by their partners and children, if any, and then by other household members. As there may be more than one child (or other relative) living in the household, they also need to be sorted. In the case of the household grid, age is used as the secondary sorting key (from the oldest person to the youngest).
3. Routing
The routing check ensures that the structure of the underlying dataset matches the structure of the GGS questionnaire. Its main goal is to code any given variable in the dataset as a valid response, a nonresponse or a skip, as indicated in the questionnaire. Consequently, a skip indicated in the questionnaire is represented by a system missing code (. in STATA, sysmis in SPSS), while information missing for other reasons is coded as non-applicable/no response (i.e. codes 7, 8, 9 in SPSS or .a, .b, .c in STATA).
4. Consolidation  
The process consolidates the information scattered over several variables into a single one. The consolidation procedure is carried out in the Children Section, the Partnership Section and the Parents and Parental Home Section.
5. Imputation  
Because income is a sensitive topic, respondents are often reluctant to share income information with the interviewer. To be able to use income information in cross-country comparative studies without losing too many observations, it is necessary to impute the approximately correct distribution of the income variable in each country.
6. Calculation of derived variables
We calculate derived variables out of the following variables:
- grid variables (i.e., household grid, children grid, and partnership history grid); the codebook starts with the constructed variables that summarise the key socio-demographic characteristics of the respondent.
- month and year variables,  
- hours and minutes variables,
- frequency and unit variables.  
Occupation variables are recoded into ISCO-88 1 digit.
Explanations of how consolidated and derived variables are obtained are available under the field "Note" of the "Variable Description" sections.
For a more detailed and technical procedure please refer to the Data Cleaning and Harmonisation Guidelines.
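
The grid sorting described in step 2 amounts to a two-key sort. The Python sketch below orders household-grid rows by relationship to the respondent and then by age, oldest first, and renumbers the rows with the "_#" subscript. The relationship labels and their rank order are hypothetical stand-ins, since the actual codes of ahg3_# are not listed in this section; only the ordering logic (respondent, partner, children, other members, then descending age) comes from the description above.

```python
# Hypothetical relationship labels and their sort rank (not the real ahg3_# codes).
RELATION_RANK = {"respondent": 0, "partner": 1, "child": 2, "other": 3}

def sort_household_grid(rows):
    """Sort grid rows by relationship rank, then by age from oldest to youngest."""
    def key(row):
        rank = RELATION_RANK.get(row["relation"], len(RELATION_RANK))
        return (rank, -row["age"])
    return sorted(rows, key=key)

# Hypothetical unsorted household grid.
grid = [
    {"relation": "child", "age": 8},
    {"relation": "other", "age": 70},
    {"relation": "respondent", "age": 40},
    {"relation": "child", "age": 12},
    {"relation": "partner", "age": 42},
]

sorted_grid = sort_household_grid(grid)

# Renumber rows so that subscript _1 is the first sorted row, _2 the second, etc.
relation_columns = {f"ahg3_{i}": row["relation"] for i, row in enumerate(sorted_grid, start=1)}

for name, relation in relation_columns.items():
    print(name, relation)
```

Because Python's sort is stable, rows that tie on both keys keep their original order; the documentation does not say how such ties are resolved in the harmonised files.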

Missing Data

The following missing values have been assigned:
- 6, 96, 996, etc. = Unknown (only for consolidated variables in the group "administrative variables")
- 7, 97, 997, etc. = Don't know
- 8, 98, 998, etc. = Refusal
- 9, 99, 999, etc. = Not-applicable/no response
For further information, see the GGS Wave 1 questionnaire manual: http://www.ggp-i.org/materials/survey-instruments.html.
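
The pattern "7, 97, 997, etc." means that the missing code depends on how many digits a variable can hold. The Python sketch below shows one way to generate the codes for a given width and to label a value accordingly. Passing the width explicitly is an assumption made for illustration, because this section does not say how the width of each variable is determined; as noted under Routing, the STATA files may use extended missing values rather than these numeric codes.

```python
# Last digit of each missing code and its meaning, as listed above.
# Code 6 applies only to consolidated variables in the group "administrative variables".
MISSING_LABELS = {
    6: "Unknown",
    7: "Don't know",
    8: "Refusal",
    9: "Not applicable/no response",
}

def missing_codes(width):
    """Return {code: label} for a variable that is `width` digits wide."""
    if width < 1:
        raise ValueError("width must be at least 1")
    # Leading digits are all 9s: 0 for width 1, 90 for width 2, 990 for width 3, ...
    prefix = int("9" * (width - 1)) * 10 if width > 1 else 0
    return {prefix + last_digit: label for last_digit, label in MISSING_LABELS.items()}

def classify(value, width):
    """Label a value as one of the missing categories or as 'valid'."""
    return missing_codes(width).get(value, "valid")

print(missing_codes(2))
print(classify(98, width=2))
print(classify(8, width=2))
```

Note that the same number can be valid for one variable and missing for another: 8 is a refusal in a one-digit variable but an ordinary value in a two-digit one.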

Version

Harmonized dataset, GGS Wave1, version 4.3.

Notes

IMPROVEMENTS INTRODUCED WITH V.4.3. (August 2016):
Variables corrected with Version 4.3.
- fertintent (no more ambiguous labelling)
- a1101 (corrected error in coding)
- aweight (now available also for NLD CZE SWE POL)
- aregion (now available also for HUN)
- aplace (now available also for HUN)
- a5112 (corrected routing error for ROU)
- a5113 (corrected routing error for ROU)
- a5114 (corrected routing error for ROU)
- a5115 (corrected routing error for ROU)
- a211b_ (corrected error for POL & GEO)
- ankids (corrected error for POL & GEO)
- a1008mnth (corrected error for NGR & BEL)
- a108 (now available for SWE)
- a109_1 (now available for SWE)
- a109_2 (now available for SWE)
- a149 (now available for SWE)
- a309 (now available for SWE)
- aregion (now available for SWE)
- a620_ (corrected error for DEU & CZE)
- a402 (corrected error for POL)
- a149 (corrected routing error for NOR)
- a344 (corrected routing error for NOR)
- a256_ (corrected error for POL & GEO)

IMPROVEMENTS INTRODUCED WITH V.4.2. (February 2014):
The update from v4.1 to v4.2 does not include corrections of existing variables.  
The update only includes additional variables which are derived from the pre-existing datasets
- Variables derived from grid variables and variables which concern the respondent and his/her partner: numdissol numdivorce nummarriage numpartners livingwithpartner childprevp femage maleage femeduc maleeduc fertintent numbiol numres numnonres numstep numallchild ageyoungest ageoldest numrespleave numotherparentleave coreschild coresparen coresgrandp coressibl.
- Variables derived from month and year variables:
a808Dur a822Dur a907Dur a911Dur a914Dur; a303cAgeP a315AgeP a316cAgeP a374cAgeP a608AgeP a610AgeP a617bAgeP a621AgeP a914AgeP a941AgeP; a107AgeR a121AgeR a150AgeR a239aAgeR a239bAgeR a240AgeR a301AgeR a302bAgeR; a311AgeR a314bAgeR a314dAgeR a371AgeR a372bAgeR a603AgeR a608AgeR a610AgeR a613AgeR a614AgeR a619AgeR a621AgeR a816AgeR a822AgeR a871AgeR a5116AgeR a5117bAgeR; a302bTdiff a314bTdiff a314dTdiff a372bTdiff.
- Variables derived from hours and minutes variables: a324_hour a520_hour a540_hour.
- Variables derived from frequency and unit variables: a205mnth, a241mnth, a325mnth, a355mnth, a359mnth, a363mnth, a367mnth, a521mnth, a541mnth, a1008mnth, a1102mnth; a203c_?w a204c_?w.
- Occupation variables recoded into ISCO-88 1 digit: a828_1dig a832_1dig a861_1dig a917_1dig a921_1dig a933_1dig a5112_1dig a5114_1dig.
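
To illustrate how variables derived from month and year variables (for instance those with the suffixes AgeR or Dur) might be computed, a minimal sketch follows. It assumes complete dates with months coded 1-12, ages in completed years and durations in months. These units, the function names and the handling of incomplete dates are assumptions of this example and are not specified in this document.

```python
def months_between(start_year: int, start_month: int,
                   end_year: int, end_month: int) -> int:
    """Number of months from the start date to the end date."""
    for month in (start_month, end_month):
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range: {month}")
    return (end_year - start_year) * 12 + (end_month - start_month)


def age_at_event(birth_year: int, birth_month: int,
                 event_year: int, event_month: int) -> int:
    """Age in completed years at the month of the event."""
    months = months_between(birth_year, birth_month, event_year, event_month)
    if months < 0:
        raise ValueError("event precedes birth")
    return months // 12
```

A person born in June 1970 is thus 29 at an event in May 2000 and 30 at an event in June 2000.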

FIRST DATASET RELEASED: V. 4.1 (October 2013).

Notes

Before publication in Nesstar, GGS micro data files are further processed to ease online data browsing and analysis.
Variables in which all cases are system missing are deleted.

File Name

GGS_Wave2_Poland_V.1.3..NSDstat

Contents of Files

This study includes the consolidated GGS Wave 2 datasets. It contains all the released GGS Wave 2 datasets, except for the Turkish subsample of Germany. All available variables are included, including the country-specific variables.

VARIABLES HAVING ALL SYSTEM MISSING CASES ARE DROPPED BEFORE PUBLICATION IN NESSTAR.
This is why the total number of variables in the Nesstar data file is smaller than in the SPSS and STATA files.
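
The effect of this step can be sketched as follows, assuming the data are held as a list of records in which None stands for a system missing value. The function name and the example variable names in the usage note are illustrative only.

```python
from typing import Any, Dict, List


def drop_all_system_missing(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Remove every variable that is system missing (None) in all records."""
    if not rows:
        return []
    # Keep a variable as soon as one record holds a non-missing value.
    keep = [
        name for name in rows[0]
        if any(row.get(name) is not None for row in rows)
    ]
    return [{name: row.get(name) for name in keep} for row in rows]
```

A variable that is None in every record is removed, whereas a variable with at least one non-missing value is kept for all records.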

Variables are ordered according to the sections of the GGS codebook: Household, Children, Partnerships, Household Organisation and Partnership Quality, Parents and Parental Home, Fertility, Health and Well-Being, Respondent's Activity and Income, Activity and Education History, Partner's Activity and Income, Household Possessions, Income and Transfers, Value Orientations and Attitudes, Interviewers' report.
The variables begin with a letter designating the wave of data collection ("a" for the first wave, "b" for the second wave). We have attempted to keep variable names the same across waves; new variables are identified as ["wave letter"]n, e.g., bn301.
Although we encourage countries to follow the GGS Questionnaire strictly, a country might implement a question that differs considerably from it. In this case we either add country-specific response values or introduce a country-specific variable.
Country-specific values are added when the question follows the model questionnaire but the answers are only partly compatible or not compatible at all. They are at least 4 digits long (F4 format) and begin with the country code, e.g., 2401 for Australia, whose country code is 24.
A country-specific variable is introduced when the question differs from the model questionnaire but measures the same concept. Such variables are identified by a suffix consisting of the country code plus a number, e.g., a119_2401 for Australia.
For an overview of the GGS country codes, please refer to the variable "acountry".
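
The naming conventions above can be read programmatically along the following lines. This is a hedged sketch: the regular expression assumes a 2-digit country code followed by at least two further digits in the suffix, and the helper names are hypothetical. The "n" marker for new variables is deliberately not parsed, because a variable stem may itself begin with "n".

```python
import re
from typing import NamedTuple, Optional

# Wave letter, stem, and an optional country-specific suffix such as _2401.
_NAME = re.compile(
    r"^(?P<wave>[ab])(?P<stem>.+?)(?:_(?P<country>\d{2})(?P<number>\d{2,}))?$"
)


class ParsedName(NamedTuple):
    wave: int                    # 1 for "a", 2 for "b"
    stem: str                    # name without wave letter and country suffix
    country_code: Optional[int]  # e.g. 24, or None if not country specific
    number: Optional[str]        # digits following the country code


def parse_variable_name(name: str) -> ParsedName:
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected variable name: {name}")
    wave = 1 if match["wave"] == "a" else 2
    country = int(match["country"]) if match["country"] else None
    return ParsedName(wave, match["stem"], country, match["number"])


def country_code_of_value(value: int) -> Optional[int]:
    """Country code of a country-specific response value, else None.

    Only meaningful for categorical variables: a year or a 4-digit
    missing code would otherwise be misread as country specific.
    """
    return int(str(value)[:2]) if value >= 1000 else None
```

For instance, `parse_variable_name("a119_2401")` yields wave 1, stem "119" and country code 24, while a grid variable such as ahg3_1 is not treated as country specific because its suffix has only one digit.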

File Structure

Record Group

Overall Case Count

12294

Overall Variable Count

1135

Type of File

Nesstar 200801

Place of File Production

The file is produced centrally by the Netherlands Interdisciplinary Demographic Institute (NIDI, The Netherlands), in collaboration with the Survey Department of the "Institut national d'études démographiques" (INED, France).

Extent of Processing Checks

DATA HARMONISATION
The data are submitted in pre-harmonised form, prepared and organised according to the GGS standards.
Harmonisation aims at achieving a clear and comparable format of the GGS micro-data files that is adequate for cross-country comparison. The harmonisation procedure consists of the following steps:
1. Label checks  
This step makes sure that all variables are named identically across countries and refer to a particular question in the GGS Questionnaire. The value labels are also checked; they should be the same across GGS datasets.
2. Dealing with grids
The GGS Questionnaire contains several grids holding either event history information or information on the members of the household. Such data need to be harmonized with specific attention to the order and logical consistency of grid rows (either household members or events such as births). In the data, each row of the grid is represented by a variable name followed by a subscripted number ("_#"). Each subscript thus represents one household member or one event. Part of the grid harmonization is grid sorting: grid rows are sorted according to a pre-defined key. For example, in the household grid, household members are sorted according to their relationship to the respondent, i.e., the relation-to-respondent variable (ahg3_# or bhg3_#). Respondents appear first, followed by their partners and children, if any, and then by other household members. As there may be more than one child (or other relative) living in the household, these rows also need to be sorted. In the household grid, age is used as the secondary sorting key (from the oldest person to the youngest).
3. Routing
The routing check ensures that the structure of the underlying dataset matches the structure of the GGS questionnaire. Its main goal is to code any given variable in the dataset as either a valid response, a nonresponse or a skip, as indicated in the questionnaire. Consequently, a skip indicated in the questionnaire is represented by a system missing code (. in STATA, sysmis in SPSS), while information missing for other reasons is coded as non-applicable/no response (i.e., codes 7, 8, 9 in SPSS or .a, .b, .c in STATA).
4. Consolidation  
The process consolidates the information scattered over several variables into a single one. The consolidation procedure is carried out in the Children Section, the Partnership Section and the Parents and Parental Home Section.
5. Imputation  
Because income is a sensitive topic, respondents are often reluctant to share income information with the interviewer. In order to use income information in cross-country comparative studies without losing too many observations, it is necessary to impute the approximately correct distribution of the income variable in each country.
6. Calculation of derived variables
We calculate derived variables out of the following variables:
- grid variables (i.e., household grid, children grid, and partnership history grid); the codebook starts with the constructed variables that summarise the key socio-demographic characteristics of the respondent,
- month and year variables,  
- hours and minutes variables,
- frequency and unit variables.  
Occupation variables are recoded into ISCO-88 1 digit.
Explanations of how consolidated and derived variables are obtained are available under the field "Note" of the "Variable Description" sections.
For a more detailed, technical description of the procedure, please refer to the Data Cleaning and Harmonisation Guidelines.
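
The grid sorting described under step 2 can be sketched as follows. The relationship labels, their ranking and the field names "relation" and "age" are hypothetical stand-ins for the coded grid variables (such as ahg3_#); the sketch only illustrates sorting by relationship first and by descending age second.

```python
from typing import Any, Dict, List

# Hypothetical ranking of the relationship to the respondent.
RELATION_RANK = {"respondent": 0, "partner": 1, "child": 2}
OTHER_RANK = 3  # all remaining household members


def sort_household_grid(members: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort grid rows by relationship, then from oldest to youngest."""
    def key(member: Dict[str, Any]):
        rank = RELATION_RANK.get(member["relation"], OTHER_RANK)
        age = member.get("age")
        # Rows without a known age go last within their relationship group.
        return (rank, age is None, -(age or 0))
    return sorted(members, key=key)


def number_rows(members: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Spread sorted rows over subscripted names such as relation_1, age_1."""
    wide: Dict[str, Any] = {}
    for index, member in enumerate(sort_household_grid(members), start=1):
        for field, value in member.items():
            wide[f"{field}_{index}"] = value
    return wide
```

After sorting, the first subscript refers to the respondent, the next ones to the partner and to the children from oldest to youngest, and the remaining ones to other household members.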

Missing Data

The following missing values have been assigned:
- 6, 96, 996, etc. = Unknown (only for consolidated variables in the group "administrative variables")
- 7, 97, 997, etc. = Don't know
- 8, 98, 998, etc. = Refusal
- 9, 99, 999, etc. = Not-applicable/no response

Version

Harmonized dataset, GGS Wave 2, version 1.3.

Notes

IMPROVEMENTS INTRODUCED WITH GGS_Wave2_V.1.3 (August 2016)
Correction of the following variables, which were previously erroneous: b343_*, bnnumdissol, bnumdissol, bnnumdivorce, bnumdivorce, bnnummarriage, bnummarriage.

IMPROVEMENTS INTRODUCED WITH GGS_Wave2_V.1.2 (April 2015)
The update from v1.1 to v1.2 does not include corrections of existing variables. The update only includes additional variables which are derived from the pre-existing datasets.  

- Variables derived from grid variables and variables which concern the respondent and his/her partner: bnumdissol, bnnumdissol, bnumdivorce, bnnumdivorce, bnnummarriage, bnummarriage, bnumpartnerships, bnnumpartnerships, bnrespartafterw1, blivingwithpartner, bchildprevp, bnchildprevp, bfemage, bmaleage, bfemeduc, bmaleeduc, bfertintent, bnumbiol, bnumnonres, bnumres, bnumstep, bnumallchild, bageoldest, bageyoungest, bcoreschild, bcoresgrandp, bcoresparen, bcoressibl, bhhtype.
- Variables derived from month and year variables: b121AgeR, b150AgeR, bn152AgeR, b239aAgeR, b239bAgeR, b240AgeR, bn304Agb303cAgeP, b311AgeR, b315AgeP, b316cAgeP, b371AgeR, b372bAgeR, b372bTdiff, b374cAgeP, b5116AgeR, b5117bAgeR, b603AgeR, b608AgeP, b608AgeR, b610AgeP, b610AgeR, b621AgeP, b621AgeR, b871AgeR, b907Dur, b911Dur, b914AgeP, b914Dur, b941AgeP.
- Variables derived from hours and minutes variables: b324hour, b520hour, b540hour, b221hour_x.
- Variables derived from frequency and unit variables: b203c_xw, b204c_xw, b205mnth, b241mnth, b325mnth, b521mnth, b1008mnth.
- Occupation variables recoded into ISCO-88 1 digit: b828_1dig, b832_1dig, b861_1dig, b917_1dig, b921_1dig, b933_1dig.
- Three groups of variables derived from section no. 8 "Activity and Education History": 1) variables counting the total number of different activity and education situations R has had since age 16 (i.e., bnnumworkstatuses, bnnumstudentstatuses, bnnumemplstatuses, bnnumselfemplstatuses, bnnumhelpfamstatuses, bnnumunemplstatuses, bnnumretiredstatuses, bnnummilitarystatuses, bnnumhomestatuses, bnnummatleavestatuses, bnnumparleavestatuses, bnnumdisabilitystatuses, bnnumotherstatuses, bnnum1401, bnnum1501, bnnum1801, bnnum1301, bnnumparttime, bnnumfulltime, bnnumboth, bnnumparttime_1801, bnnumparttime_1802, bnnumpartfulltime_1803, bnnumfulltime_1804); 2) the total duration in months of each of the different situations (i.e., bndurstudentstatuses, bnduremplstatuses, bndurselfemplstatuses, bndurhelpfamstatuses, bndurunemplstatuses, bndurretiredstatuses, bndurmilitarystatuses, bndurhomestatuses, bndurmatleavestatuses, bndurparleavestatuses, bndurilldisabledstatuses, bndurotherstatusstatuses, bndur1501, bndur1401, bndur1301, bndurparttime, bndurlastparttime, bndurstudwhilework); 3) the age of R at the beginning and end of periods of part-time employment (i.e., bn876_xAgeR, bn877_xAgeR, bn878xAgeR, bn879_xAgeR).

The availability of these variables in each different country-specific file depends on the availability of variables used for their calculation.

IMPROVEMENTS INTRODUCED WITH V.1.2. in Nesstar GGS micro data files (April 2015):
Publication of variables that were previously deleted before dataset release in Nesstar. The following variables are concerned: grid variables, month and year variables, hours and minutes variables, frequency and unit variables, and occupation variables.

Notes

Before publication in Nesstar, GGS micro data files are further processed to ease online data browsing and analysis.
Variables in which all cases are system missing are deleted.

WAVE 2 DATASETS - Main differences compared to WAVE 1 datasets
Wave 2 datasets include an additional section that was not implemented in the Wave 1 data collection: section no. 8, "Activity and Education History". Respondents report comprehensively on their activity and education history since age 16. Two additional sections are also present at the end of the Wave 2 datasets: "Interviewer observations" and "Interviewer report" (sections no. 13 and 14, respectively).
A set of constructed variables at the top of the data file increases the usability of the GGS data by summarizing key socio-demographic characteristics of the respondent (age, birth year, sex, level of educational attainment, activity status, partnership status, number of co-resident partners, number of children, household size, household type). An additional set of variables consolidates information on the current activity of the respondent and his/her partner that is otherwise spread over the questionnaire. Another set of consolidated variables concerns the respondents' parents and parental home.

WAVE 2 DATASETS - Variables names
Variables in the Wave 2 data sets that are consistent with variables implemented in the Wave 1 questionnaire are named identically. Wave 2 variable names start with the letter "b" compared to letter "a" in Wave 1. Variables that have not been implemented in Wave 1 but collected in Wave 2 begin with "bn".  
In Wave 2 datasets published in Nesstar, the variable "brid - R identification number" has been renamed to "arid" (the same variable name as in Wave 1). This allows the user to merge Wave 1 and Wave 2 datasets in Nesstar.
In Wave 2 datasets published in Nesstar, variable labels have the indication "(W2)". This allows the user to distinguish Wave 2 variables from Wave 1 variables, on the basis of the variable labels.
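
Outside Nesstar, the same identifier can be used to link the two waves. The following is a minimal sketch, assuming each wave is available as a list of records keyed by "arid"; the variable names "aage" and "bage" in the usage note are purely illustrative. It keeps only respondents present in both waves.

```python
from typing import Any, Dict, List


def merge_waves(wave1: List[Dict[str, Any]],
                wave2: List[Dict[str, Any]],
                key: str = "arid") -> List[Dict[str, Any]]:
    """Join Wave 1 and Wave 2 records of the same respondent on `key`."""
    wave2_by_id = {row[key]: row for row in wave2}
    merged = []
    for row in wave1:
        match = wave2_by_id.get(row[key])
        if match is not None:
            # Wave prefixes ("a", "b") keep other variable names distinct.
            merged.append({**row, **match})
    return merged
```

Given Wave 1 records for respondents 1 and 2 and Wave 2 records for respondents 2 and 3, only respondent 2 is returned, with both the "a" and the "b" variables in one record.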
