Dataset: Generations and Gender Survey Australia Wave 1 & Wave 2

Abstract

The Generations and Gender Survey (GGS) provides micro-level data with the aim of significantly improving the knowledge base for social science and policymaking in Europe and developed countries elsewhere.  
In Europe 2020, the European Union develops a strategy "to help us come out stronger from the crisis and turn the EU into a smart, sustainable and inclusive economy delivering high levels of employment, productivity and social cohesion". The economic crisis affects not only day-to-day decisions, but also fundamental choices at all stages of people's lives: marriage and childbearing, the combination of employment and caring responsibilities for the young and the old, retirement, housing, and ageing well. The GGS has been developed to provide scientists with high-quality data to contribute scientifically grounded answers to these key policy questions. Survey content focuses on intergenerational and gender relations between people, expressed in care arrangements and the organization of paid and unpaid work. Key features of the survey are:  
- Cross-national comparability. In each country data is collected on the basis of a common international questionnaire and guidelines about the methodology. Data processing includes central harmonization of national datasets.  
- A broad age range. It includes respondents between the ages of 18 and 80.
- A longitudinal design. It has a panel design, collecting information on the same persons at three-year intervals.  
- A large sample size. It has an average of 9,000 respondents per country at Wave 1.
- A theory-driven and multidisciplinary questionnaire. It provides data for policy relevant research by demographers, economists, sociologists, social policy researchers, social psychologists and epidemiologists. The questionnaire is inspired by the theory of planned behavior.
- Possibility to combine the survey data with macro data provided by the GGP Contextual Database. This combination enables analyses of individuals and families in their cultural, economic, political, social and policy contexts.

Variable Groups

Document Description

Full Title

Generations and Gender Survey Australia Wave 1 & Wave 2

Alternative Title

GGS Australia Wave 1 & Wave 2

Identification Number

GGSW1.W2.24

Date of Distribution

2014-03-07

Version

Working Version: GGS Wave 1 Version 4.3 and GGS Wave 2 Version 1.3.

Update of variable categories and documentation with the release of Poland Wave 2 Version 1.3.

Date: 2018-02-26

Guide To Codebook

In the field “Study Description”, users can find metadata about surveys. This includes the distributors, keywords, abstract, and guidelines on the bibliographic citation.  
Country specific metadata include information on survey producers, methodology and processing. For Wave 1, this information was provided by the GGP-country team, based on a metadata grid with pre-structured questions. Links to relevant references (e.g., working papers and questionnaires) are also provided. For Wave 2, country specific metadata were provided partly by the GGP-country team based on a structured questionnaire, and partly they were taken from the references listed under “Other References Note”.

The field “Data Files Description” provides metadata about the data file, such as file contents, missing values, as well as changes across different GGS versions.

The field "Variable Description" provides information on each variable, such as question text, descriptions of country specific categories and variables, universe (i.e., subset of respondents to whom the question was asked), country specific deviations to GGS routing, descriptions of the ways in which consolidated and derived variables are calculated. Variables are ordered according to the sections of the GGS codebook.

PLEASE NOTE THAT WE DOCUMENT ONLY VARIABLES HAVING VALID CASES.  
VARIABLES HAVING ALL SYSTEM MISSING CASES ARE NOT DOCUMENTED.  
This is the reason why the total no. of variables in the documentation is smaller than the total number of variables in the SPSS and STATA files.

Full Title

GGS_W1-V.4.3.&W2-V.1.3_Australia

Producer

Name Affiliation Abbreviation Role
Arianna Caporali Institut national d'études démographiques (INED) AC

Study Description

Full Title

Generations and Gender Survey Australia Wave 1 & Wave 2

Alternative Title

GGS Australia Wave 1 & Wave 2

Parallel Title

Contained in "The Household, Income and Labour Dynamics in Australia Survey"

Identification Number

GGSW1.W2.24

Authoring Entity

Name Affiliation
Mark Wooden Melbourne Institute of Applied Economic and Social Research
Department of Families, Housing, Community Services and Indigenous Affairs (FaHCSIA)
Peter McDonald Australian Demographic and Social Research Institute, The Australian National University

Producer

Name Affiliation Abbreviation Role
Melbourne Institute of Applied Economic and Social Research MIAESR
Peter McDonald Australian Demographic and Social Research Institute, The Australian National University ADSRI

Funding Agency/Sponsor

Name Abbreviation Role Grant
Australian government through the Department of Families, Housing, Community Services and Indigenous Affairs

Data Distributor

Name Affiliation Abbreviation
Institut national des études démographiques - 133 boulevard Davout 75980 Paris Cedex 20, France. INED
Netherlands Interdisciplinary Demographic Institute - Lange Houtstraat 19, NL-2511 CV The Hague, The Netherlands NIDI

Depositor

Name Affiliation Abbreviation
Peter McDonald Australian Demographic and Social Research Institute, The Australian National University ADSRI

Bibliographic Citation

United Nations 2005. Generations & Gender Programme: Survey Instruments. New York and Geneva: UN, 2005.

List of Keywords

Date of Collection

Start End Cycle
2005-08 2006-03
2008-08 2009-02

Country

Australia  (AUS)

Geographic Coverage

It covers the six Australian states and two territories outside the borders of the states, excluding remote and sparsely populated areas.

Geographic Unit

Census Districts

Unit of Analysis

Individuals

Universe

The reference population for the HILDA survey wave 1 was all members aged 15+ of private dwellings in Australia, with the following exceptions:
     • diplomatic personnel of overseas governments, customarily excluded from censuses and surveys;
     • overseas residents in Australia (i.e., persons who had stayed or intended to stay in Australia for less than one year);
     • members of non-Australian defence forces (and their dependents) stationed in Australia; and
     • people living in remote and sparsely populated areas.
To ensure that all members of the in-scope population have the same probability of selection, dwellings that are not primary places of residence (e.g., holiday homes) were also excluded.

All members of the households providing at least one interview in wave 1 form the basis of the panel to be pursued in each subsequent wave.

Kind of Data

Survey data

Notes

The GGS in Australia was "piggybacking" on the Household, Income and Labour Dynamics in Australia (HILDA) survey. The HILDA Survey is a household panel survey that started in 2001, with a new wave each following year.
- GGS Wave 1 corresponds to HILDA Wave 5 (2005-2006).
- GGS Wave 2 corresponds to HILDA Wave 8 (2008-2009).

Time Method

Panel

Data Collector

Sampling Procedure

SAMPLING PROCEDURE FOR HILDA WAVE 1
1. Sampling frame
1.1 Type of frame: List of geographic units - 1996 Census districts (CDs). This frame excluded CDs that had zero land area (usually used for homeless people or off-shore/migratory people) or were remote or sparsely populated.  
1.2 Frame coverage: NA
1.3 Frame size: From 34,422 Census Districts in 1996, CDs defined as remote or sparsely populated were excluded. In addition, CDs that had zero land area were also excluded. A sample of 488 CDs was then selected.
1.4 Level of units available: Dwelling

2. Sampling method
2.1 Sampling method type: Multistage
2.2 Sampling stage definition
  - PSU: Census District  
  - SSU: Dwellings
  - TSU: Households
2.3 Sampling stage size
  - PSU: 488
  - SSU: 12,252 (22-34 dwellings chosen within each census district depending on the expected response and occupancy rates of the area)
  - TSU: 7,800
2.4 Unit selection: Census District selection. The list of 1996 Census Collection Districts (CDs) formed the area-based frame from which 488 CDs were selected. The frame of CDs was stratified by State, and within the five largest States in terms of population, by metropolitan and non-metropolitan regions. The CDs were sampled with probability proportional to their size as measured by the number of dwellings (unoccupied and occupied) recorded in the 1996 Census. To ensure the sample of CDs selected provided good coverage of the CDs in the frame, the CDs were sorted by statistical sub-division and section of State. Within each of these groups, the CDs were sorted into geographical (or serpentine) ordering based on the centroid of the CDs. Using a random start, a systematic selection of CDs was then undertaken by staff at the Melbourne Institute.
2.5 Final stage unit selection: Dwelling selection and Household selection.
     - Dwelling selection: ACNielsen used a specifically trained team of interviewers to visit each selected CD and provide a full listing of the dwellings from which dwellings were selected for the [HILDA] Wave 1 sample. The interviewers followed a predetermined route around the entire CD to list all dwellings they came across. Particular attention was paid to ensure that all dwellings had an equal probability of selection, including granny flats, flats, residential warehouses and battleaxe properties. The actual number of dwellings selected within each CD varied depending on projected variations across CDs in response rates and in occupancy rates. The response rate assumptions were based on ACNielsen's experience with survey response rates within the metropolitan and non-metropolitan areas of each State. The occupancy rates are based on the 1996 Census occupancy rates, with the additional qualification that no more than 10 additional selections could be added to cover the expected number of unoccupied dwellings. Given a targeted average of 16 responding households per CD, this meant the selected sample had to be large enough to generate 23 occupied in-scope dwellings per CD. The average number of selected dwellings was thus 25. The selection of dwellings from those listed occurred as follows: The initial dwelling was selected at random from the list of dwellings in the area. A skip of five in urban areas and two in rural areas was then applied to select the remainder of the dwellings required for the area. This ensured that the cluster of dwellings selected from each CD was sufficiently spread out across the CD while not generating large travel costs. For five of the more remote CDs selected that were extremely large in size, a block selection stage was used so that the entire CD did not have to be listed. 
For each of these areas, a satellite map which details the buildings in the CD was used to divide the CD into blocks with an approximately equal expected number of dwellings in each. A sample of these blocks was selected for full block listing.
     - Household selection: Where a dwelling contained three or fewer households, all such households were sampled. Where there were four or more households occupying one dwelling, all households had to be enumerated at the time of first contact and a random sample of three households was obtained.
2.6 Within Household unit selection: All household members aged 15 and over interviewed.
2.7 Stratification: Explicit
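
The CD selection described in 2.4 (probability proportional to size, a sorted frame, a random start, then a systematic selection) can be sketched as follows. This is a minimal illustration and not the procedure actually run at the Melbourne Institute: the function name and the toy dwelling counts are assumptions, and stratification is left out (in practice the routine would be applied within each stratum).

```python
import random
from itertools import accumulate


def pps_systematic_sample(sizes, n, rng=None):
    """Pick n unit indices with probability proportional to size.

    `sizes` holds the size measure of each unit (e.g. dwelling counts)
    and must already be in the desired frame order (e.g. serpentine).
    A unit larger than the sampling interval can be picked more than once.
    """
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("need a positive sample size and positive total size")
    rng = rng or random.Random()
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    cumulative = list(accumulate(sizes))     # running total of sizes
    selected = []
    idx = 0
    for k in range(n):
        point = start + k * interval         # k-th selection point
        # Advance to the unit whose cumulative range contains the point.
        while idx < len(sizes) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected


if __name__ == "__main__":
    # Hypothetical dwelling counts for six CDs in frame order.
    dwellings = [120, 300, 80, 250, 150, 100]
    print(pps_systematic_sample(dwellings, 2, rng=random.Random(2005)))
```

Because the selection points are evenly spaced along the cumulative size scale, sorting the frame geographically before selection spreads the sample across the sorted groups, which is the coverage property the text refers to.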


AUSTRALIAN GGS WAVE 1  
The sampling procedure outlined above describes the method used to select the initial HILDA sample. However, the GGS Wave 1 did not come from the first wave of HILDA, but rather from Wave 5. The selection of the sample in Wave 5 would have been determined in large part by the initial sample selection used when HILDA first started. However, some people who participated in Wave 5 of HILDA (Wave 1 of GGS) would have been new people who entered after Wave 1 and who would not have been exposed to the same sample selection as the respondents who had participated since Wave 1.  
Over time, the Wave 1 household members are followed but the sample is also extended to include:
     • any children born to or adopted by members of the selected households;
     • new household members resulting from changes in the composition of the original households; and
     • new household members who arrived in Australia for the first time after 2001 (since HILDA Wave 9).
Continuing Sample Members (CSMs) include all members of wave 1 households (including children). Any children born to or adopted by CSMs are also classified as CSMs. Further, all new entrants to a household who have a child with a CSM and any recent immigrants to Australia (arriving after 2001 for the Main Sample, 2011 for the Top-Up Sample) are converted to CSM status. CSMs remain in the sample indefinitely. All other people who share a household with a CSM in wave 2 or later are considered Temporary Sample Members (TSMs).  
Where the household has moved, split or moved and split, the interviewers and office staff track the CSMs. The CSMs (along with their new household) are then interviewed, where applicable, at their new address or by phone. TSMs that split from a household and are no longer part of a household with a CSM are not followed. However, if the TSM is converted to a CSM, then they are followed for interview as any CSM would be.
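
The following rules described above can be summarised in a short sketch. The `Person` class and all of its field names are hypothetical and are not HILDA or GGS variables; the sketch only restates the CSM/TSM logic given in the text.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are illustrative assumptions, not survey variables.
    in_wave1_household: bool = False   # member of a wave 1 household
    child_of_csm: bool = False         # born to or adopted by a CSM
    has_child_with_csm: bool = False   # new entrant who has a child with a CSM
    recent_immigrant: bool = False     # arrived after the relevant cut-off year
    lives_with_csm: bool = False       # currently shares a household with a CSM


def is_csm(person: Person) -> bool:
    """Continuing Sample Member: stays in the sample indefinitely."""
    return (
        person.in_wave1_household
        or person.child_of_csm
        or person.has_child_with_csm
        or person.recent_immigrant
    )


def should_follow(person: Person) -> bool:
    """CSMs are always followed; TSMs only while they live with a CSM."""
    return is_csm(person) or person.lives_with_csm


if __name__ == "__main__":
    flatmate = Person(lives_with_csm=True)   # a TSM still living with a CSM
    print(is_csm(flatmate), should_follow(flatmate))
```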

AUSTRALIAN GGS WAVE 2
The sample consists of respondents who participated in HILDA Wave 5.

Mode of Data Collection

WAVE 1
• Method: Face-to-Face, Telephone and Self-administered were used. 93.5 per cent of respondents (N=11,932) were interviewed face-to-face, while 6.5 per cent (N=827) were interviewed by telephone. Respondents were also given a self-completion questionnaire to fill in.
• Technique: Paper and Pencil (PAPI).

WAVE 2
• Method: Face-to-Face, Telephone and Self-administered were used. 10.1 per cent of the respondents were interviewed by telephone. Respondents were also given a self-completion questionnaire to fill in.
• Technique: Paper and Pencil (PAPI).

The vast majority of the data were collected through face-to-face interviews. While telephone interviews and assisted interviews were conducted to ensure a high response rate, they are only used as a last resort. Because some households moved outside of the areas originally selected across Australia in wave 1, and because of the desire to interview as many people as possible, more telephone interviews are necessary in later waves.

Type of Research Instrument

Structured questionnaire in English (though a professional interpreter may be present during the interview).

Characteristics of Data Collection Situation

WAVE 1 DATA COLLECTION
1. Interviewers
1.1 Total number of interviewers:  132   
1.2 Number of interviewers in the field: DK.
1.3 Network organization: DK.
1.4 Working arrangement of interviewers: DK.
1.5 Payment of interviewers: DK.

2. Interviewer training
2.1 General interviewing: Yes, with a briefing session covering the aims of the survey, fieldwork procedures, questionnaire content, and strategies to maximise response rates.  
2.2 Survey specific: NA.
2.3 Length: The main briefing period took place over two days. Less experienced interviewers were also provided with an extra day's training session focusing on refusal aversion.   
2.4 Control of performance: Yes. Regular monitoring of each interviewer's response rate and progress against fieldwork schedule. Reallocation of workloads of under-performing interviewers to better interviewers, and scrutinisation of work returned by interviewers and provision of feedback especially in relation to the quality of the data collected. Supervisors re-contacted respondents to validate a minimum of 10 per cent of the questionnaires completed by each interviewer. The supervisors validated a selection of questions with the respondent as well as questioning any discrepancies identified in the provided information.
2.5 Interviewer survey: NA.

3. Contact protocols
3.1 Advance letter: A Primary Approach Letter and Newsletter were sent about one month before the interviewer was due to contact the household. The newsletter reported some results from prior waves of the HILDA Survey. Households with new entrants to the survey were given a New Entrants Brochure, which provided additional information about the study, why they had been asked to participate, and a method to opt out of the study.  
3.2 Cold contacts:  NA.
3.3 Scheduling / scattering: Yes, calls were made on a minimum of five different days, not all of which could be consecutive. A mix of daytime and evening times were used.
3.4 Contact history: Yes.
3.5 Min number of contacts: No.
3.6 Max number of contacts: Yes. The interviewer had up to six calls to make contact and a further six calls to undertake all of the interviews once contact had been made. If a household had to be put into tracking and was found, the initial call allocation to make contact with the household was carried over to the next period of the fieldwork. When following up a household, the interviewer had a total of five calls to finalise the household.

4. Questionnaire localization
4.1 Validation: NA.
4.2 Pre-test: A pilot was carried out in April 2005 (500 respondents).
4.3 Length of interview: The average length of the person interview was 31.7 minutes for a 'continuing sample member' who had also been interviewed at least once in previous HILDA surveys; 37.5 minutes for a 'new sample member'. There was an additional 30 minutes on average to complete a self-completion questionnaire, and 6 minutes for the household form.

WAVE 2 DATA COLLECTION
1. Interviewers
1.1 Total number of interviewers:  138   
1.2 Number of interviewers in the field: DK.
1.3 Network organization: DK.
1.4 Working arrangement of interviewers: DK.
1.5 Payment of interviewers: DK.

2. Interviewer training
Same as Wave 1 (see above).

3. Contact protocols
Same as Wave 1 (see above).

4. Questionnaire localization
4.1 Validation: NA.
4.2 Pre-test: The questionnaires are developed over the 9-month period prior to the main fieldwork for each wave.
4.3 Length of interview: The average length of the person interview was 35.8 minutes.  
There was an additional 30 minutes on average to complete a self-completion questionnaire, and 6 minutes for the household form.

Actions to Minimize Losses

WAVE 1 ACTIONS
1.  Dealing with nonresponse
- Incentives: a cheque for $25 to each person of the household who participated; plus a bonus $25 in case everybody in the household took part.  
      
2. Tracking of sampled units
2.1 Respondent contact information: Yes, contact details of the respondent were collected. These included work and mobile phone numbers, emails, and new address if the respondent was planning to move in the next 12 months, and knew his/her new address.
2.2 Other contact information: Yes. Sample members provided one or two contacts who were most likely to know where they were if they moved. Work, home and mobile phone numbers, email addresses and postal addresses were collected for these contacts.  
2.3 Cards: Yes, a follow-up newsletter was sent.
2.4 Additional surveys: DK
2.5 Administrative records: No.

WAVE 2 ACTIONS
Same as Wave 1.

Weighting

The datasets contain two versions of the cross-sectional person weight:  
• The responding person population weight is the cross-section population weight for all people who responded in the relevant wave (i.e. they provided a personal interview).        
• The responding person sample weight is the cross-section responding person population weight rescaled to sum to the number of responding persons in the relevant wave.  
These cross-sectional weights opportunistically include temporary members into the sample (i.e., those people who are part of the sample only because they currently live with a continuing sample member). The underlying probability of selection for these households is amended to account for the various pathways into the household. Following this, non-response adjustments are made which require within-sample modelling of non-response probabilities and benchmarking to known population estimates.
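
The relationship between the two weights can be illustrated with a small sketch. It assumes only what is stated above, namely that the sample weight is the population weight rescaled so that the weights sum to the number of responding persons; the function name and the example values are hypothetical.

```python
def rescale_to_sample_weights(population_weights):
    """Rescale population weights so that they sum to the number of cases."""
    n = len(population_weights)
    total = sum(population_weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one case and a positive weight total")
    factor = n / total
    return [w * factor for w in population_weights]


if __name__ == "__main__":
    # Hypothetical population weights for three responding persons.
    pop_weights = [1000.0, 2000.0, 3000.0]
    sample_weights = rescale_to_sample_weights(pop_weights)
    print(sample_weights, sum(sample_weights))
```

The rescaled weights keep the same relative sizes, so weighted proportions and means are unchanged; only weighted totals differ, summing to the sample size rather than to the population.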

Cleaning Operations

The keyed numerical data were subject to 100 per cent verification (i.e., the data were entered twice and any discrepancies corrected). The keyed verbatim responses were only entered once, as these were only used for coding purposes and any mis-entered data could be easily identified and corrected. During data entry, the data were checked using range, logical and consistency edits. Where necessary, data entry was suspended until the identified problem was resolved.
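
Double-entry verification of this kind amounts to comparing two independent keying passes field by field. The sketch below is an assumption about how such a comparison could look, not the software actually used; the record layout (a dict of record id to field values) and the field names are made up for illustration.

```python
def find_keying_discrepancies(first_pass, second_pass):
    """Compare two keying passes and list every field that differs.

    Each pass maps a record id to a dict of field name -> keyed value.
    Returns (record id, field, first value, second value) tuples.
    """
    discrepancies = []
    for rec_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(rec_id, {})
        b = second_pass.get(rec_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((rec_id, field, a.get(field), b.get(field)))
    return discrepancies


if __name__ == "__main__":
    # Hypothetical keyed records; "q1" and "q2" are not real variable names.
    entry_1 = {101: {"q1": 3, "q2": 45}, 102: {"q1": 1, "q2": 30}}
    entry_2 = {101: {"q1": 3, "q2": 54}, 102: {"q1": 1, "q2": 30}}
    for item in find_keying_discrepancies(entry_1, entry_2):
        print(item)
```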

Response Rate

WAVE 1
Final disposition codes:
I = complete interview: 12,759 persons
P = partial interview: NA
NE = non-eligible : NA
NC = non-contact : NA
R = refusal: Of 9,138 households, there was no contact in 700 cases. Of enumerated individuals, 72 could not be contacted.
O = other non-response: Of 9,138 households, 1,313 refused. Of enumerated individuals, 598 refused.
UC = unknown eligibility, contacted: Of enumerated individuals, 142 had other reasons for non-response.
UN = unknown eligibility, non-contact: NA
eC = estimated proportion of contacted cases of unknown eligibility that are eligible: NA
eN = estimated proportion of non-contacted cases of unknown eligibility that are eligible: NA

WAVE 2
Final disposition codes for HILDA WAVE 8 (total no. of households interviewed: 7,066)
I = complete interview: 12,785 persons
P = partial interview: NA
NE = non-eligible : NA
NC = non-contact : NA
R = refusal: Of 9,691 households, there was no contact in 206 cases.
O = other non-response: Of 9,138 households, 459 refused.
UC = unknown eligibility, contacted: 1,970 households were not issued to field.
UN = unknown eligibility, non-contact: NA
eC = estimated proportion of contacted cases of unknown eligibility that are eligible: NA
eN = estimated proportion of non-contacted cases of unknown eligibility that are eligible: NA

See HILDA annual reports for further information on response rates of the HILDA survey (http://www.melbourneinstitute.com/hilda/Reports/annual_report.html).

Specific Response rates for GGS Australia will be provided in forthcoming weeks.

Completeness of Study Stored

The GGS core questionnaire was adapted to fit the Australian context and, because it was conducted as part of an existing longitudinal survey, a number of questions were dropped. Country-specific questions that were already asked within HILDA have been harmonized to match the GGS as closely as possible.

Restrictions

In order to access micro data files, users have to sign and submit a Statement of affiliation, confidentiality and acceptable usage. They also have to submit a title and abstract of their research project. They can use the data for all their research projects, except for datasets from Australia and Norway. Users of these datasets need to submit a new application form if they want to use the data in a different research project. The access rights from Wave 1 data are transferred to the Wave 2 data.

Access Authority

Name Affiliation E-mail address Universal Resource Identifier
UNECE Population Unit - Palais des Nations - CH-1211 Geneva 10 - Switzerland. Tel: +41 22 917 24 77 - fax: +41 22 917 01 07 ggp@unece.org http://www.unece.org/pau/

Citation Requirement

In any work emanating from research based on the Generations and Gender Survey micro-data, I will acknowledge that these data were obtained from the GGP Data Archive and refer to the publication that describes the model survey instruments: United Nations 2005. Generations & Gender Programme: Survey Instruments. New York and Geneva: UN, 2005.

Deposit Requirement

Users of GGS micro-data are required to send any research papers based on the Generations and Gender Survey micro-data or aggregate tabulations to the Population Activities Unit of the UN Economic Commission for Europe, for inclusion in the GGP publications archive.

Conditions

To access the data, it is necessary to subscribe to the GGP Data User Space and to follow the instructions available on the GGP data access webpage.

Disclaimer

The authors and producers bear no responsibility for the uses of the GGS data, or for interpretations or inferences based on these uses. The producers accept no liability for indirect, consequential or incidental damages or losses arising from use of the data collection, or from the unavailability of, or break in access to the service for whatever reason.

Related Materials

HILDA Survey webpage

HILDA Wave 5 Questionnaires

HILDA Wave 8 Questionnaires

Australia_Questionnaire_W1 (corresponding to HILDA W5 Household Questionnaire)

Australia_Questionnaire_W2 (corresponding to HILDA W8 Household Questionnaire)

Other References Note

HILDA User Manual

HILDA Statistical Reports

HILDA Survey Annual Reports

HILDA Wave 5 Survey Annual report

HILDA Wave 8 Survey Annual report

HILDA Technical Paper Series

Australian country presentations at the GGP International Working Group Meetings

Data Files Description

File Name

GGS_Wave1_Australia_V.4.3..NSDstat

Contents of Files

GGS Wave 1

VARIABLES HAVING ALL SYSTEM MISSING CASES ARE DROPPED BEFORE PUBLICATION IN NESSTAR.
This is the reason why the total no. of variables in the Nesstar data file is smaller than the total number of variables in the SPSS and STATA files.

Variables are ordered according to the sections of the GGS codebook: Household, Children, Partnerships, Household Organisation and Partnership Quality, Parents and Parental Home, Fertility, Health and Well-Being, Respondent's Activity and Income, Partner's Activity and Income, Household Possessions, Income and Transfers, Value Orientations and Attitudes, Interviewers' report.
Variable names begin with a letter designating the wave of data collection ("a" for the first wave, "b" for the second wave). We have attempted to keep the names of variables the same across the waves; new variables are identified by the pattern ["wave letter"]n, e.g., bn301.  
Although we encourage the countries to strictly follow the GGS Questionnaire, countries might implement a question that differs to a considerable extent from the GGS Questionnaire. In this case we either add country specific response values or introduce a country specific variable.  
Country specific values are added when the question follows the model questionnaire, but the answer categories are only partly compatible or not compatible at all. They are at least 4 digits long (F4 format) and begin with the country code: e.g., Australia 2401. The country code for Australia, for example, is 24.  
A country specific variable is introduced when the question differs from the model questionnaire while still measuring the same concept. Variables of this kind are identified by a suffix consisting of the country code plus a number, e.g., Australia a119_2401.
For an overview of GGS country codes, please refer to the variable "acountry".
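
The naming conventions above lend themselves to simple pattern checks. The sketch below is illustrative only: the helper names are hypothetical, and it assumes a two-digit country code followed by at least two further digits in the suffix (as in the example a119_2401), which is inferred from the examples rather than stated as a rule.

```python
import re

# Assumed suffix shape: "_" + 2-digit country code + at least 2 more digits.
_COUNTRY_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<country>\d{2})(?P<number>\d{2,})$")
# New variables: wave letter, then "n", then a digit (e.g. bn301).
_NEW_VARIABLE = re.compile(r"^[ab]n\d")


def parse_country_specific_variable(name):
    """Return (base name, country code, number) or None if no such suffix."""
    match = _COUNTRY_SUFFIX.match(name)
    if match is None:
        return None
    return match["base"], int(match["country"]), int(match["number"])


def is_new_variable(name):
    """True for names following the ["wave letter"]n pattern."""
    return _NEW_VARIABLE.match(name) is not None


def is_country_specific_value(value, country_code):
    """True for codes of 4+ digits that begin with the country code."""
    text = str(value)
    return len(text) >= 4 and text.startswith(str(country_code))


if __name__ == "__main__":
    print(parse_country_specific_variable("a119_2401"))
    print(is_new_variable("bn301"))
    print(is_country_specific_value(2401, 24))
```

Grid subscripts such as a109_1 are shorter than four digits, so under this assumption they are not mistaken for country specific suffixes.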

File Structure

Record Group

Overall Case Count

7125

Overall Variable Count

841

Type of File

Nesstar 200801

Extent of Processing Checks

WAVE 1 DATA HARMONISATION
The data is submitted in an already pre-harmonised form. It is prepared and organised according to the GGS standards.  
Harmonisation aims at achieving a clear and comparable format of the GGS micro-data files that is adequate for cross-country comparison. The harmonisation procedure consists of the following steps:
1. Label checks  
This step makes sure that all the variables are named the same across the countries and refer to a particular question in the GGS Questionnaire. Also the value labels are checked. They should be the same across GGS datasets.  
2. Dealing with grids
The GGS Questionnaire holds several grids of either event history information or members of the household. Such data need to be harmonized with specific attention to the order and logical consistency of grid rows (either household members or events such as births). In the data, each row of the grid is represented by a variable name followed by a subscripted number ("_#"). Each subscript thus represents one household member or one event. Part of the grid harmonization is grid sorting. Grid rows are sorted according to a pre-defined key. For example, in the household grid, the household members are sorted according to their relationship to the respondent, i.e., the relation to respondent variable (ahg3_# or bhg3_#). Respondents appear first, followed by their partners and children, if any, and then by other household members. As there may be more than one child (or other relative) living in the household, they also need to be sorted. In the case of the household grid, age is used as the secondary sorting key (from the oldest person to the youngest).
3. Routing
A routing check ensures that the structure of the underlying data set matches the structure of the GGS questionnaire. Its main goal is to code any given variable in the dataset to either a valid response, nonresponse or skip as indicated in the questionnaire. Consequently, a skip indicated in the questionnaire is represented with a system missing code (. in STATA, sysmis in SPSS), while information missing for other reasons is coded as non-applicable/no response (i.e. codes 7, 8, 9 in SPSS or .a, .b, .c in STATA).  
4. Consolidation  
The process consolidates the information scattered over several variables into a single one. The consolidation procedure is carried out in the Children Section, the Partnership Section and the Parents and Parental Home Section.
5. Imputation  
Because income information is sensitive, respondents are often reluctant to share it with the interviewer. To be able to use income information in a cross-country comparative study without losing too many observations, it is necessary to impute the approximately correct distribution of the income variable in each country.  
6. Calculation of derived variables
We calculate derived variables out of the following variables:
- grid variables (i.e., household grid, children grid, and partnership history grid); the codebook starts with the constructed variables that summarise the key socio-demographic characteristics of the respondent.
- month and year variables,  
- hours and minutes variables,
- frequency and unit variables.  
Occupation variables are recoded into ISCO-88 1 digit.
Explanations of the ways in which consolidated and derived variables are obtained are available under the field "Note" of the "Variable Description" sections.
For a more detailed technical description of the procedure, please refer to the Data Cleaning and Harmonisation Guidelines.
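
The grid sorting described in step 2 (relationship to the respondent as the primary key, age from oldest to youngest as the secondary key) can be sketched as below. The relationship labels are placeholders: the actual codes of ahg3_# / bhg3_# are not given here, so the mapping is an assumption for illustration only.

```python
# Hypothetical relationship labels; the real codes of ahg3_#/bhg3_# may differ.
RELATION_ORDER = {"respondent": 0, "partner": 1, "child": 2}
OTHER_MEMBER = 3  # every other household member sorts after children


def sort_household_grid(rows):
    """Sort grid rows by relationship to the respondent, then oldest first."""
    return sorted(
        rows,
        key=lambda row: (RELATION_ORDER.get(row["relation"], OTHER_MEMBER), -row["age"]),
    )


if __name__ == "__main__":
    grid = [
        {"relation": "child", "age": 5},
        {"relation": "parent", "age": 70},
        {"relation": "respondent", "age": 40},
        {"relation": "child", "age": 9},
        {"relation": "partner", "age": 42},
    ]
    for position, row in enumerate(sort_household_grid(grid), start=1):
        print(position, row["relation"], row["age"])
```

After sorting, the position of each row would supply the "_#" subscript, so that subscript 1 always refers to the respondent.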

Missing Data

The following missing values have been assigned:
- 6, 96, 996, etc. = Unknown (only for consolidated variables in the group "administrative variables")
- 7, 97, 997, etc. = Don't know
- 8, 98, 998, etc. = Refusal
- 9, 99, 999, etc. = Not-applicable/no response
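
The missing-value codes follow a regular pattern: for a variable stored with a given number of digits, the codes are the top values of that width (7, 8, 9 for one digit; 97, 98, 99 for two digits, and so on). The sketch below shows one way an analyst might classify a value. The function names are hypothetical, `None` stands in for a system missing value, and the variable width has to be supplied by the user.

```python
def missing_code_labels(width, include_unknown=False):
    """Map the missing-value codes of a `width`-digit variable to labels."""
    top = 10 ** width
    labels = {
        top - 3: "don't know",                   # 7, 97, 997, ...
        top - 2: "refusal",                      # 8, 98, 998, ...
        top - 1: "not applicable/no response",   # 9, 99, 999, ...
    }
    if include_unknown:
        # 6, 96, 996, ...: only for consolidated administrative variables.
        labels[top - 4] = "unknown"
    return labels


def classify_value(value, width, include_unknown=False):
    """Return a label for `value`; None is treated as system missing (skip)."""
    if value is None:
        return "skip (system missing)"
    return missing_code_labels(width, include_unknown).get(value, "valid")


if __name__ == "__main__":
    print(classify_value(97, width=2))
    print(classify_value(7, width=2))
    print(classify_value(None, width=1))
```

The width matters because the same number can be a valid answer in one variable and a missing code in another: 7 is a valid value for a two-digit variable but means "don't know" for a one-digit one.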

Version

Harmonized dataset, GGS Wave1, version 4.3.

Notes

IMPROVEMENTS INTRODUCED WITH V.4.3. (August 2016):
Variables corrected with Version 4.3.:
- fertintent (no more ambiguous labelling)
- a1101 (corrected error in coding)
- aweight (now available also for NLD CZE SWE POL)
- aregion (now available also for HUN)
- aplace (now available also for HUN)
- a5112 (corrected routing error for ROU)
- a5113 (corrected routing error for ROU)
- a5114 (corrected routing error for ROU)
- a5115 (corrected routing error for ROU)
- a211b_ (corrected error for POL & GEO)
- ankids (corrected error for POL & GEO)
- a1008mnth (corrected error for NGR & BEL)
- a108 (now available for SWE)
- a109_1 (now available for SWE)
- a109_2 (now available for SWE)
- a149 (now available for SWE)
- a309 (now available for SWE)
- aregion (now available for SWE)
- a620_ (corrected error for DEU & CZE)
- a402 (corrected error for POL)
- a149 (corrected routing error for NOR)
- a344 (corrected routing error for NOR)
- a256_ (corrected error for POL & GEO)

IMPROVEMENTS INTRODUCED WITH V.4.2. (February 2014):
The update from v4.1 to v4.2 does not include corrections of existing variables.  
The update only includes additional variables, which are derived from the pre-existing datasets:
- Variables derived from grid variables and variables which concern the respondent and his/her partner: numdissol, numdivorce, nummarriage, numpartners, livingwithpartner, childprevp, femage, maleage, femeduc, maleeduc, fertintent, numbiol, numres, numnonres, numstep, numallchild, ageyoungest, ageoldest, numrespleave, numotherparentleave, coreschild, coresparen, coresgrandp, coressibl.
- Variables derived from month and year variables: a808Dur, a822Dur, a907Dur, a911Dur, a914Dur; a303cAgeP, a315AgeP, a316cAgeP, a374cAgeP, a608AgeP, a610AgeP, a617bAgeP, a621AgeP, a914AgeP, a941AgeP; a107AgeR, a121AgeR, a150AgeR, a239aAgeR, a239bAgeR, a240AgeR, a301AgeR, a302bAgeR, a311AgeR, a314bAgeR, a314dAgeR, a371AgeR, a372bAgeR, a603AgeR, a608AgeR, a610AgeR, a613AgeR, a614AgeR, a619AgeR, a621AgeR, a816AgeR, a822AgeR, a871AgeR, a5116AgeR, a5117bAgeR; a302bTdiff, a314bTdiff, a314dTdiff, a372bTdiff.
- Variables derived from hours and minutes variables: a324_hour, a520_hour, a540_hour.
- Variables derived from frequency and unit variables: a205mnth, a241mnth, a325mnth, a355mnth, a359mnth, a363mnth, a367mnth, a521mnth, a541mnth, a1008mnth, a1102mnth; a203c_?w, a204c_?w.
- Occupation variables recoded into ISCO-88 1 digit: a828_1dig, a832_1dig, a861_1dig, a917_1dig, a921_1dig, a933_1dig, a5112_1dig, a5114_1dig.


IMPROVEMENTS INTRODUCED WITH V.4.1. (April 2012):
- Variables corrected: aeduc amarstat anpartner ankids
- Variables now correctly named: aweight_*

IMPROVEMENTS INTRODUCED WITH V.4.0 (March 2012):
- New constructed variables: asex aage abyear aeduc aactstat aparstat amarstat anpartner ankids ahhsize ahhtype.
- New consolidated variables on respondents' current activity: a870, a871m, a871y, a874, a875.
- New consolidated variables on respondents' partners' current activity: a940, a941m, a941y.
- Variables corrected: a601 and a602 (corrected, with consequences on the response rate of subsequent variables), a622, a623, a624, a625, a626, a627*, a628*, a629*, a631*, a383 (now rounded)

FIRST DATASET RELEASED: V. 3.0 (January 2012).

File Name

GGS_Wave2_Australia_V1.3..NSDstat

Contents of Files

GGS Wave 2

VARIABLES HAVING ALL SYSTEM MISSING CASES ARE DROPPED BEFORE PUBLICATION IN NESSTAR.
This is why the total number of variables in the Nesstar data file is smaller than the total number of variables in the SPSS and STATA files.
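
To make the difference in variable counts concrete, the following Python sketch drops every variable that is missing for all cases. It only illustrates the principle; it is not the routine actually used for publication in Nesstar. The function name is hypothetical, and system-missing values are assumed to be represented as None.

```python
def drop_all_missing(rows):
    """Drop every variable whose value is None (system missing) in all rows.

    `rows` is a list of dicts mapping variable names to values.
    """
    if not rows:
        return []
    keep = [
        name
        for name in rows[0]
        if any(row.get(name) is not None for row in rows)
    ]
    return [{name: row.get(name) for name in keep} for row in rows]
```

A variable that is missing for some cases but not for all is kept unchanged.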

Variables are ordered according to the sections of the GGS codebook: Household, Children, Partnerships, Household Organisation and Partnership Quality, Parents and Parental Home, Fertility, Health and Well-Being, Respondent's Activity and Income, Partner's Activity and Income, Household Possessions, Income and Transfers, Value Orientations and Attitudes, Interviewers' report.
Variable names begin with a letter designating the wave of data collection ("a" for the first wave, "b" for the second wave). We have attempted to keep the names of variables the same across the waves; new variables are identified by the wave letter followed by "n", e.g., bn301.  
Although we encourage the countries to follow the GGS Questionnaire strictly, a country might implement a question that differs considerably from the GGS Questionnaire. In this case we either add country-specific response values or introduce a country-specific variable.  
Country-specific values are added when the question follows the model questionnaire but the answers are not, or only partly, compatible. They are at least 4 digits long (F4 format) and begin with the country code, e.g., 2401 for Australia, whose country code is 24.  
A country-specific variable is introduced when the question differs from the model questionnaire while still measuring the same concept. Variables of this kind are identified by a suffix consisting of the country code plus a number, e.g., b119_2401 for Australia.
For an overview of the GGS country codes, please refer to the variable "bcountry".
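
The naming convention for country-specific variables and values can be checked programmatically. The Python sketch below is a minimal example under stated assumptions: the function and class names are hypothetical, the country code is assumed to be exactly two digits (as in the Australian example, 24), and the value check is meant only for categorical response codes, not for numeric quantities such as years.

```python
import re
from typing import NamedTuple, Optional


class CountryVariable(NamedTuple):
    wave: str          # "a" for Wave 1, "b" for Wave 2
    base: str          # variable name without wave letter and suffix
    country_code: int  # e.g. 24 for Australia
    number: int        # running number after the country code


def parse_country_variable(name: str) -> Optional[CountryVariable]:
    """Split a name such as "b119_2401" into its parts, or return None."""
    match = re.fullmatch(r"([ab])(\w+)_(\d{2})(\d{2,})", name)
    if match is None:
        return None
    wave, base, code, number = match.groups()
    return CountryVariable(wave, base, int(code), int(number))


def is_country_specific_value(value: int, country_code: int) -> bool:
    """True if a response code has at least 4 digits and starts with the country code."""
    text = str(value)
    return len(text) >= 4 and text.startswith(str(country_code))
```

With these definitions, "b119_2401" is recognised as a Wave 2 variable for country code 24, while an ordinary name such as "a109_1" is not treated as country-specific.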

File Structure

Record Group

Overall Case Count

6143

Overall Variable Count

540

Type of File

Nesstar 200801

Extent of Processing Checks

WAVE 2 DATA HARMONISATION: see "Extent of Processing Checks" "WAVE 1 DATA HARMONISATION".

Missing Data

The following missing values have been assigned:
- 6, 96, 996, etc. = Unknown (only for consolidated variables in the group "administrative variables")
- 7, 97, 997, etc. = Don't know
- 8, 98, 998, etc. = Refusal
- 9, 99, 999, etc. = Not-applicable/no response

Version

Harmonized dataset, GGS Wave2, version 1.3.

Notes

IMPROVEMENTS INTRODUCED WITH GGS_Wave2_V.1.3 (August 2016)
Correction of the following variables, which were previously erroneous: b343_*, bnnumdissol, bnumdissol, bnnumdivorce, bnumdivorce, bnnummarriage, bnummarriage.

IMPROVEMENTS INTRODUCED WITH GGS_Wave2_V.1.2 (April 2015)
The update from v1.1 to v1.2 does not include corrections of existing variables. The update only includes additional variables which are derived from the pre-existing datasets.  

- Variables derived from grid variables and variables which concern the respondent and his/her partner: bnumdissol, bnnumdissol, bnumdivorce, bnnumdivorce, bnnummarriage, bnummarriage, bnumpartnerships, bnnumpartnerships, bnrespartafterw1, blivingwithpartner, bchildprevp, bnchildprevp, bfemage, bmaleage, bfemeduc, bmaleeduc, bfertintent, bnumbiol, bnumnonres, bnumres, bnumstep, bnumallchild, bageoldest, bageyoungest, bcoreschild, bcoresgrandp, bcoresparen, bcoressibl, bhhtype.
- Variables derived from month and year variables: b121AgeR, b150AgeR, bn152AgeR, b239aAgeR, b239bAgeR, b240AgeR, bn304Agb303cAgeP, b311AgeR, b315AgeP, b316cAgeP, b371AgeR, b372bAgeR, b372bTdiff, b374cAgeP, b5116AgeR, b5117bAgeR, b603AgeR, b608AgeP, b608AgeR, b610AgeP, b610AgeR, b621AgeP, b621AgeR, b871AgeR, b907Dur, b911Dur, b914AgeP, b914Dur, b941AgeP.
- Variables derived from hours and minutes variables: b324hour, b520hour, b540hour, b221hour_x.
- Variables derived from frequency and unit variables: b203c_xw, b204c_xw, b205mnth, b241mnth, b325mnth, b521mnth, b1008mnth.
- Occupation variables recoded into ISCO-88 1 digit: b828_1dig, b832_1dig, b861_1dig, b917_1dig, b921_1dig, b933_1dig.
- Three groups of variables derived from section no. 8 "Activity and Education History":
  1) variables counting the total number of different activity and education situations the respondent (R) has had since age 16 (i.e., bnnumworkstatuses, bnnumstudentstatuses, bnnumemplstatuses, bnnumselfemplstatuses, bnnumhelpfamstatuses, bnnumunemplstatuses, bnnumretiredstatuses, bnnummilitarystatuses, bnnumhomestatuses, bnnummatleavestatuses, bnnumparleavestatuses, bnnumdisabilitystatuses, bnnumotherstatuses, bnnum1401, bnnum1501, bnnum1801, bnnum1301, bnnumparttime, bnnumfulltime, bnnumboth, bnnumparttime_1801, bnnumparttime_1802, bnnumpartfulltime_1803, bnnumfulltime_1804);
  2) the total duration in months of each of the different situations (i.e., bndurstudentstatuses, bnduremplstatuses, bndurselfemplstatuses, bndurhelpfamstatuses, bndurunemplstatuses, bndurretiredstatuses, bndurmilitarystatuses, bndurhomestatuses, bndurmatleavestatuses, bndurparleavestatuses, bndurilldisabledstatuses, bndurotherstatusstatuses, bndur1501, bndur1401, bndur1301, bndurparttime, bndurlastparttime, bndurstudwhilework);
  3) the age of R at the beginning and end of part-time employment spells (i.e., bn876_xAgeR, bn877_xAgeR, bn878xAgeR, bn879_xAgeR).

The availability of these variables in each country-specific file depends on the availability of the variables used for their calculation.  

IMPROVEMENTS INTRODUCED WITH V.1.2. in Nesstar GGS micro data files (April 2015):
Publication of variables that were previously deleted before dataset release in Nesstar. The following variables are concerned: grid variables, month and year variables, hours and minutes variables, frequency and unit variables, and occupation variables.

FIRST DATASET RELEASED: V. 1.1. (March 2014).

Notes

WAVE 2 DATASETS - Main differences compared to WAVE 1 datasets
Wave 2 datasets include a new section that was not implemented in the Wave 1 data collection: section no. 8, "Activity and Education History", in which respondents report comprehensively on their activity and education history since age 16. Two additional sections are also present at the end of the Wave 2 dataset: "Interviewer observations" and "Interviewer report" (sections no. 13 and 14, respectively).
A set of constructed variables at the top of the data file increases the usability of the GGS data by summarizing key socio-demographic characteristics of the respondent (age, birth year, sex, level of educational attainment, activity status, partnership status, number of co-resident partners, number of children, household size, household type). An additional set of variables consolidates information on the current activity of the respondent and his/her partner that is otherwise spread over the questionnaire. Another set of consolidated variables concerns respondents' parents and parental home.

WAVE 2 DATASETS - Variables names
Variables in the Wave 2 datasets that are consistent with variables implemented in the Wave 1 questionnaire are named identically, except that Wave 2 variable names start with the letter "b" instead of the letter "a" used in Wave 1. Variables that were not implemented in Wave 1 but were collected in Wave 2 begin with "bn".  
In Wave 2 datasets published in Nesstar, the variable "brid - R identification number" has been renamed to "arid" (the same variable name as in Wave 1). This allows the user to merge Wave 1 and Wave 2 datasets in Nesstar.
In Wave 2 datasets published in Nesstar, variable labels carry the indication "(W2)". This allows the user to distinguish Wave 2 variables from Wave 1 variables on the basis of the variable labels.
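
Outside Nesstar, the same merge can be reproduced by matching records on the respondent identifier. The Python sketch below is a minimal example, not an official procedure: the function names are hypothetical, records are assumed to be dictionaries keyed by variable name, and a Wave 2 file that still carries "brid" is assumed to need the same renaming to "arid" before merging.

```python
def rename_id(row, old="brid", new="arid"):
    """Return a copy of a record with the identifier variable renamed."""
    return {(new if name == old else name): value for name, value in row.items()}


def merge_waves(wave1_rows, wave2_rows, key="arid"):
    """Keep respondents present in both waves and combine their records.

    Both inputs are lists of dicts that contain the identifier `key`.
    """
    wave2_by_id = {row[key]: row for row in wave2_rows}
    merged = []
    for row in wave1_rows:
        match = wave2_by_id.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged
```

Because Wave 1 names start with "a" and Wave 2 names with "b", the combined record has no name clashes apart from the shared identifier.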
