Data user guide
Two types of data are available to data users:
- restricted release data
- general release data.
In the restricted release data, the only information not included is the name, address and other contact details for the child, family, child care agency and teacher or carer. Access to the restricted release data may be granted where data users can demonstrate a genuine need for the additional data and meet the necessary additional security requirements. Note that from Wave 7, in-confidence data are known as restricted release data.
In the general release data, in addition to the information removed from the restricted release data, some other items have been removed, and some items have been transformed, had response categories collapsed or been top-coded (i.e. outlying values recoded to a less extreme value). These items are flagged in the “Confidentialisation” column of the data dictionary. Data users should be aware that flagged items are eligible for confidentialisation if required, but not every flagged item necessarily requires confidentialisation in a given wave.
The confidentialisation measures applied to the general release data are detailed below.
The following items are removed:
- qualitative data provided by respondents
- census and postcode data for the location of carers and schools.
The following items are transformed:
- postcode - postcodes are given an indicator so that all children selected in the same postcode can be identified
- date left hospital after birth - number of days between birth and departure.
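The two transformations above can be illustrated with a minimal Python sketch. It is not the procedure LSAC uses: the function names, the way indicator labels are assigned and the sample values are all assumptions made for illustration.

```python
from datetime import date


def postcode_indicators(postcodes):
    """Replace each postcode with an arbitrary integer label.

    Children in the same postcode receive the same label, so they can
    still be grouped, but the label does not contain the postcode.
    Labels are assigned in order of first appearance (an assumption).
    """
    labels = {}
    indicators = []
    for postcode in postcodes:
        if postcode not in labels:
            labels[postcode] = len(labels) + 1
        indicators.append(labels[postcode])
    return indicators


def days_until_left_hospital(birth_date, left_hospital_date):
    """Return the number of days between birth and leaving hospital."""
    return (left_hospital_date - birth_date).days


# Made-up example values, not LSAC data.
example_indicators = postcode_indicators(["3000", "2600", "3000"])
example_days = days_until_left_hospital(date(2004, 3, 1), date(2004, 3, 5))
```

In both cases the released value keeps what an analyst typically needs (which children share an area, how long the hospital stay was) without exposing the original postcode or date.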
The following items have response categories collapsed (i.e. response categories combined to form an aggregate category):
- parents' occupation - output at the two-digit Australian and New Zealand Standard Classification of Occupations (ANZSCO) level or, for ANU4 ratings of occupational prestige, rounded to the nearest five
- occupation in previous job - output at two-digit ANZSCO level
- Socio-Economic Index for Areas (SEIFA) variables - rounded to the nearest 10
- country of birth (coded as 0 if fewer than five contributors)
- religion (coded as 0 if fewer than five contributors)
- language other than English (LOTE) (coded as 0 if fewer than five respondents).
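As a rough illustration of the rounding and small-category rules listed above, the following Python sketch rounds a score to the nearest multiple of a base and recodes rarely reported categories to 0. The function names are hypothetical, and the handling of ties (rounding halves up) is an assumption, since the guide does not state a tie rule.

```python
import math
from collections import Counter


def round_to_nearest(value, base):
    """Round value to the nearest multiple of base (halves round up)."""
    return int(math.floor(value / base + 0.5)) * base


def suppress_small_categories(codes, min_count=5, suppressed_code=0):
    """Recode any category reported by fewer than min_count people to 0."""
    counts = Counter(codes)
    return [
        code if counts[code] >= min_count else suppressed_code
        for code in codes
    ]


# Made-up example values, not LSAC data.
seifa_rounded = round_to_nearest(1034, 10)   # nearest 10
prestige_rounded = round_to_nearest(63, 5)   # nearest 5
countries = suppress_small_categories([1101] * 6 + [2102] * 3)
```

With `min_count=5`, a category with five or more contributors keeps its code, while one with fewer than five is replaced by 0, matching the threshold described in the list.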
The following data items are top-coded:
- housing costs
- child support paid by Parent 2
- children's and parents' current height, weight and waist circumference
- number of hours spent in child care.
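Top-coding can be sketched in a few lines of Python. The cap used here is a made-up value: the guide does not state the thresholds applied to any item, so both the function name and the threshold are assumptions.

```python
def top_code(values, cap):
    """Recode values above cap to cap; leave missing values (None) as-is."""
    return [
        value if value is None else min(value, cap)
        for value in values
    ]


# Made-up weekly child care hours with a hypothetical cap of 60.
hours_top_coded = top_code([10, 45, 80, None], cap=60)
```

Analysts working with top-coded items should keep in mind that values at the cap represent "cap or higher", so means and variances computed from them will tend to be understated.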
The following topics are suppressed in relation to Study Child pregnancy and offspring information:
LSAC assessed the disclosure risk of the Study Child offspring information available in the K cohort (fewer than five cases). The topics considered highly vulnerable to privacy risk were family demographics, health behaviour and risk factors, health status, home education environment, offspring program characteristics, paid work, parenting, parent living elsewhere, relationships and social capital. This information is available in the restricted release data file, whereas in the general release data file it has been suppressed and is presented as missing.