Dataset: Generations and Gender Survey 2020 Kazakhstan Wave 1

Abstract

The Generations and Gender Survey (GGS) provides micro-level data that can be used to investigate partnership dynamics, transition to adulthood, fertility, care and support networks, division of household tasks, and contraception, among other topics. These data are an essential resource for understanding fundamental societal challenges across Europe and beyond, and form a substantial basis for the formulation of evidence-based policies.  
Key features of the survey are:  
- Cross-national comparability: The comparative focus allows analyses of the ways in which policies, culture and economic circumstances influence dependencies between men and women and between the young and the old.
- A longitudinal design: The GGP survey applies a panel design - collecting information on the same persons at three-year intervals - to allow the examination of causes and consequences of inequalities between genders and generations.  
- A large sample size: The GGP survey has an average of 10,000 respondents per country, making it possible to study numerical minorities and uncommon events.
- A broad age range: The GGP collects data on the whole life course by interviewing respondents aged 18-79. It also enables analysis of multiple generations by asking extensive questions about intergenerational exchange and support.
- The combination of micro and macro data: Alongside the micro data collected via surveys, the GGP has a contextual database with over 100 indicators which cover not only the year of the survey but also retrospective indicators covering the past 40 years to be used alongside the retrospective data in the surveys.
- A theory-driven and multidisciplinary questionnaire: The GGS questionnaire is developed and maintained by a team of leading social scientists from demography, sociology and economics. The questionnaire seeks to bring together a wide range of subjects that examine the causes and consequences of family change.

Variable Groups

Document Description

Full Title

Generations and Gender Survey 2020 Kazakhstan Wave 1

Alternative Title

GGS 2020 Kazakhstan Wave 1

Identification Number

GGS2020.W1.34

Date of Distribution

2020-12-02

Version

Version V.1.0 : GGS 2020 - Wave 1

Date: 2020-12-02

Guide To Codebook

In the field “Study Description”, users can find metadata about surveys. This includes the distributors, keywords, abstract, and guidelines on the bibliographic citation.  
Country specific metadata include information on survey producers, methodology and processing. This information was provided by the GGP Central Coordination Team (CCT) in collaboration with the GGP country teams, based on a metadata grid with pre-structured questions. Links to relevant references (e.g., working papers and questionnaires) may also be provided.  

The field “Data Files Description” provides metadata about the data file, such as file contents, missing values, as well as changes across different GGS versions.

The field "Variable Description" provides information on each variable, such as question text, descriptions of country specific categories and variables, universe (i.e., subset of respondents to whom the question was asked), descriptions of the ways in which consolidated and derived variables are calculated. Variables are ordered according to the sections of the GGS questionnaire.

Full Title

GGS_W1-V.4.3.&W2-V.1.3_France

Producer

Name Affiliation Abbreviation Role
Arianna Caporali Institut national d'études démographiques (INED) AC

Study Description

Full Title

Generations and Gender Survey 2020 Kazakhstan Wave 1

Alternative Title

GGS 2020 Kazakhstan Wave 1

Identification Number

GGS2020.W1.34

Authoring Entity

Name Affiliation
Ainura Dossanova Committee on Statistics, Kazakhstan
Gaziza Moldakulova United Nations Population Fund (UNFPA)

Other identifications and acknowledgments

Name Affiliation Role
Tom Emery NIDI GGP Contact

Producer

Name Affiliation Abbreviation Role
Committee on Statistics, Kazakhstan

Date of Production

2020-09-28

Funding Agency/Sponsor

Name Abbreviation Role Grant
United Nations Population Fund UNFPA

Data Distributor

Name Affiliation Abbreviation
Institut national d'études démographiques - 9 cours des Humanités CS 50004, 93322 Aubervilliers Cedex, France. INED
Netherlands Interdisciplinary Demographic Institute - Lange Houtstraat 19, NL-2511 CV The Hague, The Netherlands NIDI

Depositor

Name Affiliation Abbreviation
Committee on Statistics, Kazakhstan United Nations Population Fund (UNFPA)

Bibliographic Citation

CITATION GUIDELINES for the DATASETS: names of the persons in the GGP national team involved in the data collection, plus  members of the GGP Central Coordination Team (CCT) who were involved in the data collection and/or preparation.

Dossanova, A.,  Moldakulova, G., Emery, T., Koops, J., and Caporali, A. (2020). Kazakh Harmonized Generations and Gender Survey-II. Wave 1 (2018). Version 1.
Data obtained from the GGP Data Archive.

METHODOLOGY REFERENCES: We recommend that users cite the original Vikat et al. (2007) paper and a more recent one.

Vikat, A., Spéder, Z., Beets, G., Billari, F., Bühler, C., Désesquelles, A., Fokkema, T., Hoem, J., MacDonald, A., Neyer, G., Pailhé, A., Pinnelli, A., & Solaz, A. (2007). Generations and Gender Survey (GGS): towards a better understanding of relationships and processes in the life course. Demographic Research, 17, 389-440. https://doi.org/10.4054/DemRes.2007.17.14

Gauthier, A. H., Cabaço, S., & Emery, T. (2018). Generations and Gender Survey study profile. Longitudinal and Life Course Studies, 9(4), 456-465. https://doi.org/10.14301/llcs.v9i4.500

List of Keywords

Date of Collection

Start End Cycle
2018-04-02 2018-10-03 Wave 1

Country

Kazakhstan  (KAZ)

Geographic Coverage

The territory of Kazakhstan.

Geographic Unit

GPS positions of all sampling units were collected and stored by the national team. The public use data contain information at the regional level (audandar).

Unit of Analysis

Individuals

Universe

The non-institutionalized, resident population aged 18-79 in Kazakhstan on 2nd April 2018.

Kind of Data

Survey data

Time Method

Panel

Data Collector

Committee on Statistics, Kazakhstan

Sampling Procedure

1. Sampling frame
1.1. Sample Frame: The Kazakhstan Census of 2009 was used as the sampling frame. The frame was subdivided into 16 regions and then further delineated into urban and rural areas.
1.2. Frame Coverage: At least 95%
1.3. Frame Size: 12,073,224 individuals aged 18-79
1.4. Unit of Frame: Residential units

2. Sampling method
2.1. Sampling Method: Probability sampling. At every selection stage, units are selected with probability proportional to population size.
2.2. Sampling Stage Definitions: 840 primary Sampling Units (PSU).
2.3. Sampling Stage Sizes: Approximately 20 units per PSU.
2.4. Unit Selection Method: A random number generator was used at all stages of selection.
2.5. Final Stage Unit Selection: SRS (Simple Random Sampling) was applied.
2.6. Within Household Unit Selection: A respondent is selected by the principle of the «next birthday» method.
2.7. Stratification: Regional Stratification and Stratification by urban/rural.
2.8. Sample Size: The starting sample size was 16000 persons. The expected number of respondents in Wave 1 of data collection was 12500. The expected number of respondents in Wave 3 of data collection (estimated at the sampling stage) was 8000.
2.9. Estimated Non-Response:  
- Estimated non-response at Wave 1 (includes both non-contacts and refusals) - expressed in proportion of the starting sample: 0,2%
- Estimated yearly attrition - expressed in proportion of the starting sample: 8%
- Non-response measures, i.e., the measures that were foreseen to battle the inevitable non-response: None
- Within household non-response measures: None
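As an illustration of the probability-proportional-to-size selection described under 2.1, the sketch below draws units by systematic PPS sampling. This is only one common way to implement PPS; the codebook does not state which variant was used. The function name, the unit labels and the population counts are hypothetical and chosen for the example.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping a unit label to its population size (hypothetical data).
    A unit larger than the sampling interval can be selected more than once.
    """
    labels = list(sizes)
    total = sum(sizes.values())
    interval = total / n
    # Random start in [0, interval), then equally spaced selection points.
    start = rng.random() * interval
    points = [start + k * interval for k in range(n)]

    selected = []
    idx = 0
    cumulative = sizes[labels[0]]
    for point in points:
        # Advance to the unit whose cumulative size range contains the point.
        while point >= cumulative and idx < len(labels) - 1:
            idx += 1
            cumulative += sizes[labels[idx]]
        selected.append(labels[idx])
    return selected

# Hypothetical units; larger units are hit by more selection points.
example_sizes = {"A": 100, "B": 300, "C": 600}
example_sample = pps_systematic(example_sizes, 4, random.Random(0))
```

With a total size of 1000 and four draws, the interval is 250, so the largest unit in this toy example is always selected at least twice.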

Mode of Data Collection

Method: Face-to-Face (personal interview)  
Technique: Computer-assisted personal interviewing (CAPI)

Type of Research Instrument

Structured questionnaire in Russian and Kazakh.
The questionnaire was deployed in Russian and Kazakh; 6% of the questionnaires were fielded in Kazakh. An initial error in the coding of occupation statuses in the Russian version led to two separate Russian instruments being deployed: 1095 respondents (7.37%) filled in the initial version, and the remaining 86.69% of respondents filled in the finalized version. The two versions differ solely in the coding of occupational categories, which is more detailed in the final version. For international comparisons, this distinction is of no consequence, as the categories are harmonized and consolidated for comparative use.

Characteristics of Data Collection Situation

1. Interviewers
1.1. Total number of interviewers: 211
1.2. Total number of interviewers in the field: 150
1.3. Network Organization: Field Coordinators were organized at a regional level. Regions were then further broken down into sub-regions based on the geography and settlement types of the region.  
1.4. Working arrangement of Interviewers: Full-time
1.5. Payment of interviewers: Interviewers were paid per interview

2. Interviewer Training
2.1.  General Interviewing: Interviewers were predominantly professional, full time interviewers with experience in fielding large scale surveys. They had training in interviewer conduct, respondent selection and contact management.
2.2.  Survey Specific: Training was given on the specific questionnaire, including the highly sensitive nature of several questions and how to handle these in the field. Interviewers were also given an overview of the content and aims of the survey and specifically the need for detailed and accurate life history data within the Generations and Gender Survey.
2.3.  Length: A two-day training on the Generations and Gender Survey was provided to regional coordinators in Almaty, Kazakhstan. Interviewers were then given a two-day training in regional offices, where they were instructed on how to use the devices, load the survey and collect the data.
2.4.  Control of Performance: Yes, the Central Coordination Team at NIDI ran control checks on demographic elements of data quality and attempted to identify any interviewer misconduct as data were collected. These included checks on general data quality, such as item non-response, straight-lining and irregular skip patterns, as well as thematically important issues, such as the number of children reported and whether a spouse was present during the interview. For 3% of interviews, respondents were called back to verify that the interview took place and that the information provided was correct.
2.5.  Interviewer Survey: No Interviewer Survey was conducted.

3. Contact Protocols
3.1.  Advance Letter: No
3.2.  Cold Contacts: Face to Face
3.3.  Scheduling/Scatter: If contact was not possible on the first attempt, further contact attempts were made on different days and at different times until a successful interview was achieved.
3.4.  Contact History: Yes, the form was collected through the Open Data Kit (https://opendatakit.org/) and included all contact information.
3.5.  Min number of contacts: at least 3 times.
3.6.  Max number of contacts: No.

4. Questionnaire localization
4.1.  Validation: The questionnaire was adapted into both Kazakh and Russian by the team at the Committee of Statistics, with the assistance of the UNFPA. These versions were back-translated and verified against the GGS core questionnaire.
4.2.  Pre-Test: Yes, pre-testing was conducted using a non-random sample of just over 50 individuals of various ages and backgrounds.
4.3.  Length of Interview: 58.2 minutes

Actions to Minimize Losses

1.  Dealing with nonresponse  
1.1 Screening: Each household was contacted and asked about all resident household members. If there was nobody aged 18-79 in the household, the household was marked as ineligible and no survey was administered. If there was more than one individual aged 18-79 resident in the household then one was selected at random using the next birthday method, irrespective of their current availability.
1.2 Refusal conversion: Information was provided to respondents about the nature of the survey and how the data would be used to improve policy decision making. Contact information from the Committee of Statistics and UNFPA was provided to respondents.
1.3 Incentives: Respondents were provided with a small gift on behalf of the UNFPA which amounted to no more than $3 in value.
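The screening and within-household selection described under 1.1 can be sketched as follows. This is a minimal illustration, not the fieldwork software: the record layout (`age`, `month`, `day`), the function names and the handling of 29 February are assumptions made for the example.

```python
from datetime import date

def days_until_birthday(birth_month, birth_day, today):
    """Days from `today` until the next occurrence of the birthday."""
    def birthday_in(year):
        try:
            return date(year, birth_month, birth_day)
        except ValueError:
            # 29 February in a non-leap year: treated as 1 March (an assumption).
            return date(year, 3, 1)

    upcoming = birthday_in(today.year)
    if upcoming < today:
        upcoming = birthday_in(today.year + 1)
    return (upcoming - today).days

def select_respondent(members, today):
    """Return the eligible member (aged 18-79) with the next birthday.

    Returns None when the household has no eligible member, which
    corresponds to marking the household as ineligible.
    """
    eligible = [m for m in members if 18 <= m["age"] <= 79]
    if not eligible:
        return None
    return min(
        eligible,
        key=lambda m: days_until_birthday(m["month"], m["day"], today),
    )

# Hypothetical household screened on the survey's reference date.
household = [
    {"name": "A", "age": 45, "month": 3, "day": 30},
    {"name": "B", "age": 17, "month": 4, "day": 5},   # too young, not eligible
    {"name": "C", "age": 70, "month": 12, "day": 1},
]
chosen = select_respondent(household, date(2018, 4, 2))
```

Availability plays no role in the selection here, in line with the rule that the respondent is chosen irrespective of current availability.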

2. Tracking of sampled units  
2.1 Respondent contact information: Yes
2.2 Other contact information: Yes, contact details for a third party not currently residing with the respondent were requested in order to enable follow-up at Wave 2.
2.3 Cards: To be determined (TBD)
2.4 Additional surveys: TBD
2.5 Administrative records: TBD

Control operations

Soft checks were implemented within the survey itself to screen erroneous values, and data cleaning algorithms flagged problematic values once the data were submitted to the central server. These include analysis of specific interviewers and regions. Corrective instructions were issued when problematic behaviour was identified.

Weighting

A post-stratification weight on age, sex and region is provided in the data file (aweight).
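A weighted estimate using this weight could be computed along the following lines. Only the variable name aweight comes from the codebook; the function name and the toy values are hypothetical, and the codebook does not say whether the weight should be rescaled or combined with design information for variance estimation.

```python
def weighted_mean(values, weights):
    """Weighted mean that skips observations whose value is None."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no valid observations")
    return sum(v * w for v, w in pairs) / total_weight

# Toy data: a 0/1 indicator and the corresponding aweight values.
indicator = [1, 0, 1]
aweight = [2.0, 1.0, 1.0]
share = weighted_mean(indicator, aweight)
```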

Cleaning Operations

There are three sets of processing checks that are conducted:
1. In field checks: For improbable values, interviewers are given soft checks to ensure they have not mistyped the response value. These are particularly focused on the timing of events and their association with the respondent's age at the time.   
2. National Team Quality Checks: 3% of the interviews are checked by an independent team to ensure that interviewers have completed the interview in accordance with fieldwork guidelines and recorded individuals correctly.
3. Central Quality Checks: The Central Coordination Team of the GGP have a series of data quality checks that are run over the data collected. These attempt to identify any erroneous values not caught by soft checks, assess values of specific interviewers and regions to ensure that there are no systematic or falsified responses included, and test for straight lining or non-random response patterns.

Response Rate

Response rate - Final disposition codes:
- I = Complete interview: 14528
- P = Partial interview: 329
- NE = non-eligible: 175
- NC = non-contact: 1940
- R = refusal: 666
- O = Other non-response: 30
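The disposition codes above can be turned into a simple response rate. The codebook does not state which response-rate definition the GGP uses, so the sketch below assumes one straightforward choice: non-eligible cases are dropped from the denominator, and partial interviews may optionally be counted as responses. The function and parameter names are hypothetical.

```python
# Final disposition counts as listed in the codebook.
dispositions = {"I": 14528, "P": 329, "NE": 175, "NC": 1940, "R": 666, "O": 30}

def response_rate(d, count_partials=True):
    """Share of eligible sampled units that yielded an interview."""
    eligible = sum(n for code, n in d.items() if code != "NE")
    responding = d["I"] + (d["P"] if count_partials else 0)
    return responding / eligible

rate_with_partials = response_rate(dispositions)
rate_complete_only = response_rate(dispositions, count_partials=False)
```

Under these assumptions the rate is roughly 0.85 with partial interviews and roughly 0.83 with complete interviews only. Complete plus partial interviews sum to 14857, which matches the overall case count of the data file.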

Completeness of Study Stored

Items that could identify respondents or associates were removed, including personal contact information and all references to names and specific job titles.

Restrictions

In order to access micro data files, users have to sign and submit a Statement of affiliation, confidentiality and acceptable usage. They also have to submit a title and abstract of their research project. They can use the data for all their research projects.

Access Authority

Name Affiliation E-mail address Universal Resource Identifier
UNECE Population Unit - Palais des Nations - CH-1211 Geneva 10 - Switzerland. Tel: +41 22 917 24 77 - fax: +41 22 917 01 07 ggp@unece.org http://www.unece.org/pau/

Citation Requirement

In any work emanating from research based on the Generations and Gender Survey micro-data, users should acknowledge that these data were obtained from the GGP Data Archive and refer to the publications that describe the model survey instruments. Authors must specify which datasets are used, including the countries, waves and versions used in analysis.

CITATION GUIDELINES for the DATASETS: names of the persons in the GGP national team involved in the data collection, plus  members of the GGP Central Coordination Team (CCT) who were involved in the data collection and/or preparation.

Dossanova, A.,  Moldakulova, G., Emery, T., Koops, J., and Caporali, A. (2020). Kazakh Harmonized Generations and Gender Survey-II. Wave 1 (2018). Version 1.
Data obtained from the GGP Data Archive.

METHODOLOGY REFERENCES: We recommend that users cite the original Vikat et al. (2007) paper and a more recent one.

Vikat, A., Spéder, Z., Beets, G., Billari, F., Bühler, C., Désesquelles, A., Fokkema, T., Hoem, J., MacDonald, A., Neyer, G., Pailhé, A., Pinnelli, A., & Solaz, A. (2007). Generations and Gender Survey (GGS): towards a better understanding of relationships and processes in the life course. Demographic Research, 17, 389–440. https://doi.org/10.4054/DemRes.2007.17.14

Gauthier, A. H., Cabaço, S., & Emery, T. (2018). Generations and Gender Survey study profile. Longitudinal and Life Course Studies, 9(4), 456-465. https://doi.org/10.14301/llcs.v9i4.500

Deposit Requirement

Users of GGS micro-data are required to submit any research papers based on the Generations and Gender Survey micro-data or aggregate tabulations to the GGP Website.

Conditions

In order to access the data, it is necessary to subscribe to the GGP Data User Space and to follow the instructions available on the GGP data access webpage.

Disclaimer

The authors and producers bear no responsibility for the uses of the GGS data, or for interpretations or inferences based on these uses. The producers accept no liability for indirect, consequential or incidental damages or losses arising from use of the data collection, or from the unavailability of, or break in access to the service for whatever reason.

Other References Note

National survey under the Generations and Gender Program in Kazakhstan. Executive Summary. UNFPA 2019

Data Files Description

File Name

GGS2020_Wave1_Kazakhstan_V.1.0..NSDstat

Contents of Files

Variables are ordered according to the sections of the GGS questionnaire used for the GGP 2020 data collection (the second round of GGP data collection). The sections are: Respondent; Partnerships; Household composition, organization and partnership quality; Parents and Parental Home; Network delineation and support; Fertility; Health and Well-Being; Respondents' Activity and Income; Partners' Activity and Income; Household Possessions, Income and Transfers; Value Orientations and Attitudes; Interviewer Observations; Interviewer Report.  

The variables begin with a letter designating the wave of data collection ("a" for the first wave, "b" for the second wave). We have attempted to keep the names of variables the same across the waves. However, variable names of the GGP 2020 data collection do not correspond to those of the previous round of GGP: variables having the same names may be linked to different questions.  

Although we encourage the countries to strictly follow the GGS Questionnaire, countries might implement a question that differs to a considerable extent from the GGS Questionnaire. In this case either we add country specific response values, or we introduce a country specific variable.  
Country specific values are added when the question follows the model questionnaire but the answer categories are not compatible, or only partly compatible, with it. They are at least 4 digits long (F4 format) and begin with the country code: e.g., 2401 for Australia, whose country code is 24.  
A country specific variable is introduced when the question differs from the model questionnaire while measuring the same concept. Such variables are identified by a suffix consisting of the country code plus a number, e.g., a119_2401 for Australia.
In order to have an overview of GGS country codes, please refer to the variable "acountry".
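The naming convention above can be checked programmatically. The sketch below is an illustration only: it assumes that country codes are always two digits, as in the Australian example, and that the suffix follows a single wave letter and an item name. The function names and the regular expression are hypothetical.

```python
import re

# Wave letter, item name, underscore, then a numeric suffix of 4+ digits.
COUNTRY_VARIABLE = re.compile(r"^(?P<wave>[a-z])(?P<item>\w+?)_(?P<suffix>\d{4,})$")

def parse_country_variable(name):
    """Split a country specific variable name into its parts.

    Returns None when the name carries no country specific suffix.
    """
    match = COUNTRY_VARIABLE.match(name)
    if match is None:
        return None
    suffix = match.group("suffix")
    return {
        "wave": match.group("wave"),
        "item": match.group("item"),
        "country_code": int(suffix[:2]),  # assumes a two-digit country code
        "suffix": suffix,
    }

def country_code_of_value(value):
    """Country code of a country specific response value such as 2401."""
    return int(str(value)[:2])  # assumes a two-digit country code

parsed = parse_country_variable("a119_2401")
```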

Overall Case Count

14857

Overall Variable Count

1429

Type of File

Nesstar 200801

Place of File Production

The file is produced centrally by the Netherlands Interdisciplinary Demographic Institute (NIDI, The Netherlands), in collaboration with the Survey Department of the "Institut national d'études démographiques" (INED, France).

Extent of Processing Checks

There are three sets of processing checks that are conducted:
1. In field checks: For improbable values, interviewers are given soft checks to ensure they have not mistyped the response value. These are particularly focused on the timing of events and their association with the respondent's age at the time.  
2. National Team Quality Checks: 10% of the interviews are checked by an independent team to ensure that interviewers have completed the interview in accordance with fieldwork guidelines and recorded individuals correctly.
3. Central Quality Checks: The Central Coordination Team of the GGP have a series of data quality checks that are run over the data collected. These attempt to identify any erroneous values not caught by soft checks, assess values of specific interviewers and regions to ensure that there are no systematic or falsified responses included, and test for straight lining or non-random response patterns.

Missing Data

The following missing values have been assigned:
- 999997 = Don't know/Reporting Error
- 999998 = Refusal
- 999999 = Not-applicable/no response
The value “.d reporting error” in Stata is recoded as “9” in SPSS (and in this file) and merged with the cases coded .c in Stata.
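Before analysis, the missing-value codes listed above usually need to be separated from valid answers. The sketch below assumes that the three codes are stored as plain numbers in the analysed variable; shorter variants of these codes, if any exist in the file, are not handled. The function name and the reason labels are hypothetical.

```python
# Missing-value codes as listed in the codebook, with illustrative labels.
MISSING_CODES = {
    999997: "dont_know_or_reporting_error",
    999998: "refusal",
    999999: "not_applicable_or_no_response",
}

def split_missing(values):
    """Replace missing codes by None and report the reason for each case."""
    cleaned = [None if v in MISSING_CODES else v for v in values]
    reasons = [MISSING_CODES.get(v) for v in values]
    return cleaned, reasons

# Toy variable with two valid answers and two missing codes.
cleaned, reasons = split_missing([2, 999998, 35, 999999])
```

Keeping the reasons in a separate list preserves the distinction between refusals and not-applicable cases, which matters when the universe of a question is restricted.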

Version

Harmonized dataset, GGS 2020 Wave 1 Kazakhstan, V.1.0
