The two main approaches to modelling correlated binary data are generalized estimating equations (GEE) and hierarchical models. The GEE approach does not readily extend to spatially correlated binary data because there is typically no replication. In addition, neither approach accounts for the bounds on the correlations of binary variables.
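To make the point about bounds concrete: the correlation between two binary variables is restricted by their marginal probabilities, so it cannot in general range over the whole of [-1, 1]. The following is a minimal sketch of the standard bounds for a pair of Bernoulli variables with success probabilities p1 and p2. The function name `binary_correlation_bounds` is hypothetical and is not taken from the work described here.

```python
import math


def binary_correlation_bounds(p1, p2):
    """Return (lower, upper) bounds on the correlation of two Bernoulli
    variables with success probabilities p1 and p2 (both strictly in (0, 1)).

    Hypothetical helper for illustration only.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    # Lower bound: the joint success probability cannot fall below max(0, p1 + p2 - 1).
    lower = max(-math.sqrt((p1 * p2) / (q1 * q2)),
                -math.sqrt((q1 * q2) / (p1 * p2)))
    # Upper bound: the joint success probability cannot exceed min(p1, p2).
    upper = min(math.sqrt((p1 * q2) / (p2 * q1)),
                math.sqrt((p2 * q1) / (p1 * q2)))
    return lower, upper


if __name__ == "__main__":
    # Equal margins of 0.5 allow the full range; unequal margins shrink it.
    print(binary_correlation_bounds(0.5, 0.5))
    print(binary_correlation_bounds(0.1, 0.5))
```

With margins of 0.1 and 0.5 the attainable correlation is confined to roughly one third in absolute value, which is the kind of restriction a fitted correlation structure would need to respect.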
Here we model spatially correlated binary variables by modelling the pairs of points in the variogram cloud. The non-stationary mean is adjusted for, and the resulting estimate of the variogram is then used in a GEE step. The two steps are iterated.
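As an illustration of what the variogram cloud contains, the sketch below computes, for every pair of sites, the separation distance and half the squared difference of the mean-adjusted responses. It assumes the mean adjustment is a simple subtraction of fitted probabilities `mu` from the binary responses `y`; the adjustment actually used in this work is not specified here, and the function name `variogram_cloud` is hypothetical.

```python
import itertools
import math


def variogram_cloud(coords, y, mu):
    """Return a list of (distance, semivariance) points, one per pair of sites.

    coords: list of (x, y) site locations
    y:      list of binary responses (0 or 1)
    mu:     list of fitted mean values (an assumed, simple mean adjustment)
    """
    resid = [yi - mi for yi, mi in zip(y, mu)]
    cloud = []
    for i, j in itertools.combinations(range(len(y)), 2):
        h = math.dist(coords[i], coords[j])       # Euclidean separation
        gamma = 0.5 * (resid[i] - resid[j]) ** 2  # half squared difference
        cloud.append((h, gamma))
    return cloud


if __name__ == "__main__":
    sites = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
    for h, gamma in variogram_cloud(sites, [1, 0, 1], [0.5, 0.5, 0.5]):
        print(round(h, 3), gamma)
```

Because every pair contributes a point, n sites give n(n - 1)/2 points in the cloud, which is what makes modelling the pairs feasible even without replication.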
In an ad hoc approach, the residuals from a logistic regression on the binary variables are modelled by a linear geostatistical model with an exponential correlation structure. It can be argued that if there is no spatial correlation structure in the residuals then there is none in the original process. This approach allows comparison of different models via modified likelihood ratio tests.
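The two ingredients of this approach can be sketched as follows, under stated assumptions. The type of residual is not specified above, so Pearson residuals are assumed; the exponential correlation is taken in its usual form exp(-h / phi), with h the distance between sites and phi a range parameter. Both function names are hypothetical, and the fitted probabilities `p` would come from a logistic regression that is not shown.

```python
import math


def pearson_residuals(y, p):
    """Pearson residuals (y - p) / sqrt(p (1 - p)) for binary responses y
    and fitted probabilities p. The residual type is an assumption."""
    return [(yi - pi) / math.sqrt(pi * (1.0 - pi)) for yi, pi in zip(y, p)]


def exponential_correlation(coords, phi):
    """Correlation matrix with entries exp(-h_ij / phi), where h_ij is the
    Euclidean distance between sites i and j and phi > 0 is the range."""
    n = len(coords)
    return [[math.exp(-math.dist(coords[i], coords[j]) / phi)
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    print(pearson_residuals([1, 0], [0.5, 0.2]))
    print(exponential_correlation([(0.0, 0.0), (0.0, 2.0)], phi=2.0))
```

As phi tends to zero the off-diagonal entries vanish, which corresponds to the case of no spatial correlation in the residuals.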
The methods are illustrated on data from the Four Area Project, which recorded the incidence of TB in badgers and cattle in four areas in Ireland from 1997 to 2002. The primary question is whether the disease is spatially clustered in badgers, in cattle, and across species.
This is work in progress.