MATH38152 - 2010/2011
- Title: Social Statistics
- Unit code: MATH38152
- Credits: 10
- Prerequisites: MATH20701
- Co-requisite units: None
- School responsible: Mathematics
- Members of staff responsible: Professor Ian Plewis, Dr. Mark Tranmer (provisional). The course is expected to be team-taught by staff from the Social Statistics DA in the School of Social Sciences.
The aims of this course unit are to increase students’ awareness of the ways in which statistical ideas are used in the social sciences and the challenges faced by social statisticians, with a particular focus on research design, data collection and data quality.
Brief Description of the unit
Observational data can be obtained in a number of ways: in cross-sectional surveys and in longitudinal investigations, from administrative sources and as part of policy evaluations. Probability sampling is an important component of the design of observational studies and the need to model different kinds of population heterogeneity drives many analyses. The course unit will be organized around four themes and will introduce students to:
- a range of probability sampling methods for social surveys,
- the most commonly used research designs in the social sciences,
- some of the problems induced by missing data in surveys,
- the challenges of drawing causal inferences from statistical models fitted to observational data.
On successful completion of this course unit, students will:
- understand the strengths and weaknesses of different research designs;
- understand and be able to put into practice the principles of statistical sampling;
- understand the scope and limitations of interpreting estimates from multiple regression and related statistical models applied to social science data.
Future topics requiring this course unit
Theme A: Statistical Sampling (10 hours)
The rudiments of sampling theory.
- Target and survey populations.
- Principles of probability sampling.
- Simple random and systematic sampling.
- Stratification (proportionate and disproportionate) and survey weights.
- Multi-stage sampling, effects of clustering, sampling with probability proportional to size.
- Ratio estimation, domain and small area estimation.
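As a rough illustration of the stratification and survey-weight topics in this theme, the sketch below draws a disproportionate stratified sample and compares the weighted and unweighted means. It is a minimal example rather than course material: the practical sessions use R, Python is used here only for illustration, and the population values, stratum names and function names are all hypothetical.

```python
import random


def stratified_sample(strata, sample_sizes, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    strata: dict mapping stratum label -> list of population values
    sample_sizes: dict mapping stratum label -> number of units to draw
    Returns a list of (stratum, value, weight) tuples, where the design
    weight is N_h / n_h (the inverse of the selection probability).
    """
    rng = random.Random(seed)
    sample = []
    for label, values in strata.items():
        n_h = sample_sizes[label]
        weight = len(values) / n_h
        for value in rng.sample(values, n_h):
            sample.append((label, value, weight))
    return sample


def weighted_mean(sample):
    """Estimate the population mean using the design weights."""
    total_weight = sum(w for _, _, w in sample)
    return sum(v * w for _, v, w in sample) / total_weight


# Hypothetical population: the small "rural" stratum is over-sampled.
population = {
    "urban": [20.0 + (i % 7) for i in range(900)],
    "rural": [40.0 + (i % 5) for i in range(100)],
}
sizes = {"urban": 45, "rural": 20}

sample = stratified_sample(population, sizes)
estimate = weighted_mean(sample)
unweighted = sum(v for _, v, _ in sample) / len(sample)
print(f"weighted: {estimate:.2f}, unweighted: {unweighted:.2f}")
```

Because the rural stratum has higher values and is over-represented in the sample, the unweighted mean is pulled upwards; the design weights restore each stratum to its population share.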
Theme B: Research Design (4 hours)
Main sources of quantitative social science data: surveys, policy evaluations, administrative datasets.
- Cross-sectional and repeated cross-sectional designs.
- Longitudinal, cohort and panel designs.
- Rotating designs.
- Evaluation designs.
- Modes of data collection: face-to-face, mail, telephone, web.
Theme C: Missing Data (4 hours)
Departures from probability samples.
- Different kinds of non-response in cross-sectional studies.
- Attrition in longitudinal studies.
- Methods for preventing non-response in surveys.
- Calibration, post-stratification and weighting methods.
- An introduction to imputation methods.
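The post-stratification topic in this theme can be sketched in a few lines. The example below is illustrative only and assumes that population counts for each group are known from an external source such as a census; the group labels, counts, outcomes and function name are hypothetical, and Python is used purely for illustration (the practical sessions use R).

```python
from collections import Counter


def poststratification_weights(respondent_groups, population_counts):
    """Return one weight per respondent: N_g / r_g for the respondent's group.

    respondent_groups: list of group labels, one per respondent
    population_counts: dict mapping group label -> known population count
    """
    respondents_per_group = Counter(respondent_groups)
    return [
        population_counts[g] / respondents_per_group[g]
        for g in respondent_groups
    ]


# Hypothetical achieved sample: younger people responded less often.
groups = ["young"] * 10 + ["old"] * 30
outcomes = [1] * 6 + [0] * 4 + [1] * 9 + [0] * 21  # binary survey outcome
known_population = {"young": 500, "old": 500}

weights = poststratification_weights(groups, known_population)
adjusted = sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)
unadjusted = sum(outcomes) / len(outcomes)
print(f"unadjusted: {unadjusted:.3f}, post-stratified: {adjusted:.3f}")
```

The adjustment removes non-response bias only to the extent that respondents and non-respondents within each group are alike on the outcome, which is itself an assumption.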
Theme D: Causal Inferences from Observational Data (4 hours)
Causal inferences from statistical models.
- The residual term in regression-type models.
- Unobserved heterogeneity and its effects on model parameters.
- Confounding, moderator and modifier variables.
- The advantages of longitudinal data.
- An introduction to propensity score matching.
Each topic in every theme will be illustrated with examples from social science and government research, and some of the practical sessions will be based on a teaching dataset from one of the UK cohort studies.
Recommended reading
- Kish, L. (1967) Survey Sampling. New York: Wiley.
- Plewis, I. (1985) Analysing Change. Chichester: Wiley.
- Rosenbaum, P. R. (2002) Observational Studies (2nd ed.). New York: Springer.
- Särndal, C.-E., Swensson, B. and Wretman, J. (1992) Model Assisted Survey Sampling. New York: Springer.
Teaching and learning methods
Two lectures and one practical/examples class each week. The practical sessions will use the statistical computing environment R. In addition, students should expect to spend at least four hours each week on private study for this course unit.
Assessment
- Coursework: weighting 20%.
- End of semester examination: two hours, weighting 80%.