MATH30722 - 2006/2007
- Title: Generalised Linear Models
- Unit code: MATH30722
- Credits: 10
- Prerequisites: Probability and Statistics I (Year 1 Semester 1), Probability and Statistics II (Year 2 Semester 1). Knowledge of Linear Models (Year 3 Semester 1) is helpful but not essential.
- Co-requisite units: None
- School responsible: Mathematics
- Member of staff responsible: Dr. Jianxin Pan
Specification
Aims
To study an important aspect of modern statistical modelling in an integrated way, and to develop the properties and uses of the GLM, focusing on those situations in which the response variable is discrete. To explore some of the wide range of real-life situations occurring in the fields of agriculture, biology, engineering, industrial experimentation, medical and social sciences that can be investigated using the GLM.
Brief Description of the unit
The Linear Statistical Model unit (MT30341) covers an important modelling strategy, concerned with investigating whether, and how, one or more explanatory variables (such as age, sex or blood pressure) influence a response variable (such as a patient's diagnosis), taking random variation in the data into account. In that unit, linear regression and the Normal distribution are assumed in order to explore a possible linear relation between a continuous response and one or more explanatory variables. In this course unit we depart from linearity and normality, the strict limitations of linear statistical models. We study the extension of linearity to non-linearity, and of normality to a commonly encountered family of distributions called the exponential family. This extension forms the Generalised Linear Model (GLM). On the one hand, the GLM unifies linear and non-linear models in a single statistical modelling strategy. On the other hand, it can be used to analyse discrete data, including binary, binomial, count and categorical data, which arise very often in the biomedical, economic and social sciences and in industrial applications.
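As a brief illustration of the extension described above, the structure of a GLM can be sketched as follows. The notation here (the symbols θ, φ, a, b, c, η and g) is an assumption borrowed from a common textbook convention, and it may differ from the notation used in the lectures.

```latex
% Exponential family density for the response Y
% (theta: natural parameter, phi: dispersion parameter)
f(y;\theta,\phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\},
\qquad
\mathrm{E}(Y) = \mu = b'(\theta), \quad \mathrm{Var}(Y) = b''(\theta)\,a(\phi).

% Linear predictor and link function g
\eta = x^{\mathsf{T}}\beta, \qquad g(\mu) = \eta.

% Familiar special cases (canonical links)
\text{Normal: } g(\mu) = \mu, \qquad
\text{Binomial: } g(\mu) = \log\frac{\mu}{1-\mu}, \qquad
\text{Poisson: } g(\mu) = \log\mu.
```

With the Normal distribution and the identity link this reduces to the ordinary linear model, while the binomial and Poisson cases correspond to the logistic regression and Poisson regression models listed in the syllabus.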
Learning Outcomes
On successful completion of the course unit students will have a good understanding of:
- principles and methods of statistical modelling for the GLM: response and explanatory variables, exponential family of distributions, maximum likelihood estimation, confidence intervals and hypothesis testing, goodness of fit, etc.
- the use of the statistical software R or S-PLUS, which is available on the Maths PC Cluster and does not require any previous programming experience.
- statistical analysis of both continuous and discrete data arising in practice by using the statistical software R or S-PLUS.
Future topics requiring this course unit
This course unit is naturally related to some 4th year courses on statistical modelling, e.g., Longitudinal Data Analysis (MT40752) and Survival Analysis (MT40762).
Syllabus
- Introduction: background, review of linear models in matrix notation, model assessment, some prerequisite knowledge. [2]
- Generalised linear models (GLM): exponential family of distributions, generalised linear models, maximum likelihood estimation, Newton-Raphson and Fisher scoring algorithms, goodness of fit, deviance, confidence intervals, hypothesis testing, GLM fitting using R or S-PLUS. [10]
- Normal linear regression models: least squares, analysis of variance, factors, interactions between factors. [2]
- Binary and binomial data analysis: distribution and models, logistic regression models, odds ratio, one- and two-way logistic regression analysis. [5]
- Poisson count data analysis: Poisson regression models with offset, two-dimensional contingency tables, log-linear models. [5]
Textbooks
- Dobson, A. J. (2002). An Introduction to Generalized Linear Models. Chapman & Hall.
- Krzanowski, W. (1998). An Introduction to Statistical Modelling. Arnold.
- McCullagh, P. and Nelder, J. A. (1990). Generalized Linear Models. Chapman & Hall.
Teaching and learning methods
Two lectures per week plus one weekly examples class.
Assessment
Coursework 20%. End of semester examination (2 hours) 80%.
