- 1 Introduction
- 2 Cleaning of time use diary data
- 3 Report on Adapted PPVT-III and 'Who Am I?'
- 4 Imputations to solve missing data problems in Wave 2.5
- 5 Review of main educational program of 4-5 year olds
- 6 Cleaning of income data
- 7 Height differences
- 8 Data issues in Wave 3.5
- 9 Data issues in Wave 4
- 10 Data issues in Wave 5
- 11 Smoking inside the household
- 12 Missing data for Wave 6 items
- 13 Issues with breadwinner questions
- 14 Date of birth corrections
- 15 Minor changes for weight, BMI and height percentiles and z-scores
- 16 Body fat percentage data corrections
- 17 Wave 4 salary and wages
- 18 Study children allergies (issues with Wave 6 and 7 data)
- 19 After school care issue Wave 7 B cohort
- 20 Who is mother/father issue
- 21 Repeated a year level issue
- 22 Executive functioning - CogState - missing data Wave 7
- 23 Expected/received child support per child
- 24 Reason for change in education institution - SC CAI 6.5 (pc44c3b1):
- 25 Child support - parent living elsewhere PLE 20.8 (pe21p5)
- 26 Informant indicator in LSAC variable naming convention: Approach in Wave 7 and subsequent Waves
- 27 Desired occupation sequencing issue
- 28 Inconsistent placement of SC question
- 29 Difference in health status of household members across waves of LSAC
- 30 Academic Rating Scale score in Wave 7
- 31 Gambling data inconsistencies
- 32 References
- Appendix A: Item-person map
- Appendix B: Principal component analysis
10 Data issues in Wave 5
The first four waves of LSAC data included geography items such as postcodes and various levels of the Australian Standard Geographical Classification (ASGC) that were generated from geocoding of the residential addresses of study families. In Wave 1, the geocodes were based on global positioning system (GPS) coordinates obtained by I-view interviewers at the time of interview, while in Waves 2-4, they were based on residential addresses collected by Australian Bureau of Statistics (ABS) interviewers.
In July 2011, the ABS introduced a new statistical geography framework called the Australian Statistical Geography Standard (ASGS) to replace the ASGC. The main purpose of the ASGS is to disseminate geographically classified statistics. It provides a common framework of statistical geography enabling the publication of statistics that are comparable and spatially integrated.
Improved data sources and technology have allowed the ABS the opportunity to create a better geography optimised for the release of ABS statistics. A new robust and stable structure means that changes over time are minimised, assisting in the maintenance of quality time-series data. In addition, the ASGS, together with improved methods of calculation, allows for more accurate correspondences to translate ABS data to non-ABS administrative and geographic regions.
For further information on this new standard refer to 1270.0.55.001 - Australian Statistical Geography Standard (ASGS): Volume 1 - Main Structure and Greater Capital City Statistical Areas, July 2011.
To take advantage of this more comprehensive, flexible and consistent way of defining Australia's statistical geography, the ASGS is included on LSAC releases from Wave 5 onwards. To ensure that there is a common geographical standard across waves, the decision was made to:
- dual-code Wave 5 residential addresses to ASGC and ASGS, enabling the comparison of old and new classifications; and
- back-code Waves 1-4 residential addresses to the new standard ASGS.
The new variables added to the general release file for each wave are shown in Table 15.
Most addresses were auto-coded using ASGS address coders, which link addresses to geographical areas. However, some addresses were incomplete, contained spelling errors or, more rarely, were identical to other addresses in the same suburb. These addresses were manually cleaned to reduce the number of records with missing geocodes. After these steps, some records still could not be geocoded to the ASGS (SA2 level). The numbers of such records for Waves 1-5 are provided in Table 16.
|Wave||Number of responding records not coded to SA2|
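A count like the one reported in Table 16 could be produced along the following lines. This is a minimal sketch, not the study's processing code: the record layout, the use of `None` for an address that could not be geocoded, and the SA2 code values are all assumptions made for illustration.

```python
# Hypothetical records: (wave, SA2 code), with None meaning the address
# could not be geocoded to SA2. The codes shown are made-up values.
records = [
    (1, "101011001"),
    (1, None),
    (2, "101011002"),
    (2, "101011002"),
    (3, None),
    (3, None),
]


def count_uncoded_by_wave(rows):
    """Return {wave: number of records with no SA2 code}."""
    counts = {}
    for wave, sa2 in rows:
        counts.setdefault(wave, 0)  # keep waves with zero uncoded records
        if sa2 is None:
            counts[wave] += 1
    return counts


uncoded = count_uncoded_by_wave(records)
for wave in sorted(uncoded):
    print(wave, uncoded[wave])
```

In the released files, missing geocodes may be represented by a reserved numeric code rather than an empty value, so the `sa2 is None` check would need to be adapted to the coding actually used.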
To enable coding to the ASGS, many addresses needed cleaning to ensure accurate data. As a result, some records now have a Statistical Local Area (SLA) where there was none previously, and others have been coded to a different SLA.
The 2011 Census and SEIFA data are available in the new ASGS classifications. However, while ASGS classifications can be provided for Waves 1-5, Census and SEIFA data for 2001 and 2006 are not available in the ASGS classifications.
From Wave 7 onwards only ASGS geography variables will be output on the files.
LSAC data include variables for the occupation of Parent 1 (P1) and Parent 2 (P2). In recent waves, the occupation of Parent Living Elsewhere (PLE) and the parents of P1/P2/PLE (i.e. the study child's grandparents) are also included. These were coded using the Australian Standard Classification of Occupations (ASCO). The ANU4 scale - a scale of occupational status calculated using ASCO, which is an occupational classification system that classifies jobs according to skill level and skill specialisation - is also provided to data users for Waves 1-4.
Since Wave 2, LSAC occupation data have also been coded to the newer occupation standard, which is the Australian and New Zealand Standard Classification of Occupations (ANZSCO). ANZSCO was introduced in 2006 and was a product of a development program between the ABS, Statistics New Zealand and the Australian Government Department of Employment and Workplace Relations.
For further information on this standard, refer to 1220.0 - ANZSCO - Australian and New Zealand Standard Classification of Occupations, First Edition, 2006.
The latest release of ASCO was in 1997, reducing its applicability to the current Australian workforce. Therefore, from Wave 5 onwards, only ANZSCO codes will be produced. To enable the transition to using ANZSCO, the study has:
- added ANZSCO codes to the Waves 2-4 data files, as these codes were already generated during these waves, and is investigating the possibility of providing ANZSCO for Wave 1 through a correction code;
- replaced the ANU4 scale from Wave 5 onwards with the Australian Socioeconomic Index 2006 (AUSEI06) (McMillan, Beavis, & Jones, 2009), the latest in the series of occupation status scales developed by the ANU; and
- provided AUSEI06 for Waves 2-4, and is investigating the possibility of adding it to Wave 1 through a correction code.
The new variables added to the general release file are in Table 17.
|pw08_5||Current occupation (ANZSCO code)|
|pw08_6||Current or most recent occupation (ANZSCO code)|
|pw08_7||Current occupation (AUSEI06 code)|
The SEP variable (z-score for socio-economic position among all LSAC families) was calculated for Waves 1 to 4 using ASCO classifications. Because ASCO is unavailable for Wave 5, the SEP variable has not been calculated and hence is not available in the Wave 5 dataset. Further work will investigate ways to calculate the SEP using the ANZSCO classifications, and a new or revised SEP variable may be available in the future.
10.3 ACIR data issue (all waves)
Analysis of the ACIR data previously supplied showed that immunisation rates in LSAC did not reflect national rates. Investigation with the data provider found that the data extractions up to Wave 5 had not included all the required records. These data have since been rectified, and data users should not use the previous version of the ACIR data.
10.4 Changes to household files
Addition of 'Person Type' to the files
In Wave 5, Person Type (f21a) is available on the Waves 1-5 files for the first time, with a code attached to each household member and wave. This item is derived from information collected in the P1 interview and amended where needed during processing. A list of the person types and a description of each is shown in Table 18.
Changes in relationship to study child information for household members
For Waves 1-4, the household file carried forward the relationship to study child for each member in the household from Wave 1 or the subsequent wave for members entering the household after Wave 1. This means that for an existing household member, the relationship information in the household file is generally the same across waves. In some cases, this will not reflect changes in the relationships within the household. Relationship changes that we know did occur include:
- a step-parent changing to adopted parent
- an unrelated adult changing to step-parent
- a foster sibling changing to adopted sibling.
From the Wave 5 interview onwards, the relationship to the study child can be updated during the interview for household members who were present in previous waves.
As a result, from Wave 5 onwards there will be differences in the relationships between study children and household members between waves.
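Data users who need to know where a relationship has been updated could compare each household member's recorded relationship across waves. The sketch below assumes a simplified, hypothetical layout (child identifier, member number, wave, relationship label); the actual household file uses its own variable names and numeric relationship codes.

```python
# Hypothetical household-file rows:
# (child_id, member_number, wave, relationship to study child)
rows = [
    ("c1", 2, 4, "step-parent"),
    ("c1", 2, 5, "adopted parent"),
    ("c1", 3, 4, "biological sibling"),
    ("c1", 3, 5, "biological sibling"),
    ("c2", 4, 4, "unrelated adult"),
    ("c2", 4, 5, "step-parent"),
]


def relationship_changes(rows):
    """Return one tuple per change between consecutive recorded waves:
    (child_id, member_number, wave_from, wave_to, old, new)."""
    history = {}
    for child_id, member, wave, rel in rows:
        history.setdefault((child_id, member), []).append((wave, rel))

    changes = []
    for (child_id, member), seq in sorted(history.items()):
        seq.sort()  # order each member's records by wave
        for (w0, r0), (w1, r1) in zip(seq, seq[1:]):
            if r0 != r1:
                changes.append((child_id, member, w0, w1, r0, r1))
    return changes


changes = relationship_changes(rows)
for change in changes:
    print(change)
```

A difference flagged this way may reflect either a genuine change in the household or a correction to an earlier wave, so each case still needs to be interpreted in context.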
Inclusion of two waves of household data in the PLE person grid
The person grid is a list of people associated with the study child, together with their demographics; some of these people may still reside with the study child, while others may have left. The Wave 5 parent living elsewhere survey instrument included roll-forward person grid data from Wave 4, so two waves of household data are now available for ongoing responding PLEs. Including Wave 4 details of a PLE's household in the survey instrument enables comparisons of the PLE's household circumstances between waves.
Concordance between people on main and PLE person grids
The concordance between the main household and the PLE's household has been provided for the first time in Wave 5. This makes it possible to identify which people appear on both files, which appear on the main file only, and which appear on the PLE file only. Table 19 lists the variables provided in the concordance file.
|MID5||Wave 5 Main Household Member Number|
|PLEID5||Wave 5 PLE Household Member Number|
|HHTYPE_5||Wave 5 Household Type|
|CHHFLOOP||Wave 5 Combined Household Row Number|
The values for HHTYPE_5 are:
- 0 = Not present at Wave 5
- 1 = Wave 5 main household member only
- 2 = Wave 5 PLE household member only
- 3 = Wave 5 main and PLE household member
For example:
- If main household member number 4 was present at Wave 5 and was also present at Wave 5 in the PLE household, where they were recorded as member number 3, the variables that link these records would contain the following values: MID5 = 4; PLEID5 = 3; HHTYPE_5 = 3.
- If main household member number 4 was in the main household only at Wave 5, the values would be: MID5 = 4; PLEID5 = -9; HHTYPE_5 = 1.
- If PLE household member number 3 was in the PLE household only at Wave 5, the values would be: MID5 = -9; PLEID5 = 3; HHTYPE_5 = 2.
The values in MID5 and PLEID5 correspond to the member number in the data files, so this will enable you to find demographic information and link it to the files if required.
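The relationship between the three linking variables can be sketched as a small function that derives the household type from the two member numbers. This is an illustrative sketch only: it assumes that -9 always marks a person who is absent from a household, as in the examples given, and that a record with -9 in both variables corresponds to HHTYPE_5 = 0. Neither assumption is confirmed for the released file.

```python
NOT_PRESENT = -9  # assumed marker for "not a member of this household"


def hhtype_5(mid5, pleid5):
    """Derive the Wave 5 household type from the two member numbers."""
    in_main = mid5 != NOT_PRESENT
    in_ple = pleid5 != NOT_PRESENT
    if in_main and in_ple:
        return 3  # main and PLE household member
    if in_main:
        return 1  # main household member only
    if in_ple:
        return 2  # PLE household member only
    return 0      # not present at Wave 5 (assumed encoding)


# The three member-number combinations from the examples given.
concordance = [
    {"MID5": 4, "PLEID5": 3},
    {"MID5": 4, "PLEID5": -9},
    {"MID5": -9, "PLEID5": 3},
]
derived = [hhtype_5(r["MID5"], r["PLEID5"]) for r in concordance]
print(derived)
```

A check of this kind can be used to confirm that the HHTYPE_5 values supplied in the concordance file agree with the member numbers before the file is linked to the main or PLE household data.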
Child report of whether at school
At the start of both the study child's audio-computer-assisted self-interview (ACASI) module and the face-to-face Child Self-Report K (CSRK) module, the interviewer records whether the study child is attending school, using response options of Yes and No. If the study child does not attend a school, some questions about schooling are not asked. These questions are directly related to the school environment and therefore are not relevant to study children not attending school. Parent 1 is also asked a question about whether the child:
- attends a government school
- attends a Catholic school
- attends an independent or private school
- is not in school.
In total, the number of K cohort children coded as not in school as a result of the P1 interview was 33, whereas from the child interview the combined number was 218. Table 20 demonstrates that there were 191 records where the responses about whether the child was in school conflicted between the two interview components.
Table 21 cross-tabulates possible reasons for the discrepancy against school type, as recorded in the P1 interview for these 189 records. Around 44% of the difference seems to be accounted for by the interview taking place at the weekends or in school holidays.
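The comparison behind these counts can be sketched as follows: a record conflicts when the P1 interview places the child in a school but the child interview records No, or the reverse. The records and category labels below are invented for illustration and do not reproduce the LSAC counts.

```python
from collections import Counter

# Hypothetical records:
# (school type from the P1 interview, "attending school" answer from the child interview)
responses = [
    ("government", "Yes"),
    ("government", "No"),
    ("catholic", "No"),
    ("independent", "Yes"),
    ("not in school", "No"),
    ("not in school", "Yes"),
]


def in_school_per_p1(school_type):
    """P1 reports the child as in school unless 'not in school' was chosen."""
    return school_type != "not in school"


def conflicts_by_school_type(rows):
    """Count conflicting records, grouped by the P1-reported school type."""
    counts = Counter()
    for p1_type, child_answer in rows:
        child_in_school = child_answer == "Yes"
        if in_school_per_p1(p1_type) != child_in_school:
            counts[p1_type] += 1
    return counts


conflicts = conflicts_by_school_type(responses)
print(dict(conflicts), sum(conflicts.values()))
```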
To improve the quality of reporting and remove this ambiguity, from Wave 6 school attendance is recorded in the same way in both the child interview and the Parent 1 interview. In the child interview, the response categories of government school, Catholic school, independent or private school, and not in school replace the Yes/No responses. This change makes it clearer that the study is asking about usual school attendance, not whether the child attended school on the interview date. This point was also emphasised in interviewer training.