From: Lance Waller <lwaller@sph.emory.edu>
Date: Mon, 19 Feb 2001 13:43:07 -0500
To: waphgis@u.washington.edu, lwaller
Subject: Re: testing spatial association

Dear Alberto (and any other interested WAPHGISers):

Here are some comments and references on the asthma map.

As clarified in your second message, the question of interest here is not so much finding clusters as it is finding whether areas of high incidence correspond to areas of high exposure. Procedures like SaTScan find clusters but (as far as I know) do not test for the associations you are interested in.

I was involved in a similar study in San Diego, published in

English, P., Neutra, R., Scalf, R., Sullivan, M., Waller, L. and Zhu, L. (1999). Examining associations between childhood asthma and traffic flow using a geographic information system. Environmental Health Perspectives, 107, 761-767.

We didn't use particularly fancy spatial statistics; this was a case-control study.
In the paper, the authors report odds ratios using Gaussian dispersion models for exposures (2 different models), as well as distance from residence to the closest street and to the street with the highest traffic counts. This report may provide some ideas for your analysis.

As for particular statistical methods, you could consider the score test proposed in

Waller, L.A., Turnbull, B.W., Clark, L.C., and Nasca, P. (1994). ``Spatial pattern analyses to detect rare disease clusters''. In Case Studies in Biometry, N. Lange, L. Ryan, L. Billard, D. Brillinger, L. Conquest, and J. Greenhouse, eds. John Wiley and Sons, New York. pp. 3-23.

Lawson, A.B. (1993). On the analysis of mortality events associated with a prespecified fixed point. Journal of the Royal Statistical Society, Series A, 156, 363-377.

Waller, L.A., Turnbull, B.W., Clark, L.C., and Nasca, P. (1992). ``Chronic disease surveillance and testing of clustering of disease and exposure: application to leukemia incidence and TCE-contaminated dumpsites in upstate New York''. Environmetrics, 3, 281-300.

Here's a quick description and how-to:

Data you need:
  Disease counts for a set of administrative areas (cases_i for the ith region)
  Population size for the same set of areas (pop_i)
  Average exposure per individual for each area (exposure_i)

Null hypothesis: All individuals have the same incidence rate regardless of location. Let's call this rate lambda.

Alternative hypothesis: Incidence rate increases multiplicatively with exposure.

Test statistic: summation of exposure_i*(cases_i - lambda*pop_i) over all areas.

To get p-value: The variance of the test statistic under the null hypothesis is the summation of (exposure_i*exposure_i)*lambda*pop_i over all areas. If you standardize the test statistic by dividing by the square root of this variance, the standardized statistic approximately follows a Normal(0,1) distribution. (This assumes that you know lambda. If you estimate it by (total number of cases)/(total number at risk), you need to make a slight adjustment in the standardization.)
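
In case it helps, the how-to above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from any of the references: the function name score_test and the four-area data set are hypothetical, the p-value is one-sided (testing for rates that increase with exposure), and the adjustment used when lambda is estimated is the usual conditional (multinomial) variance, which I am assuming is the "slight adjustment" meant above. Please check it against the references before relying on it.

```python
import math


def score_test(cases, pop, exposure, lam=None):
    """Score test for trend in Poisson counts with exposure.

    cases, pop, exposure: one value per area (cases_i, pop_i, exposure_i).
    lam: known baseline incidence rate; if None it is estimated as
         (total cases) / (total population at risk).
    Returns (statistic, standardized statistic, one-sided p-value).
    """
    if not (len(cases) == len(pop) == len(exposure)):
        raise ValueError("cases, pop and exposure must have the same length")

    total_pop = sum(pop)
    estimated = lam is None
    if estimated:
        lam = sum(cases) / total_pop

    # Test statistic: sum of exposure_i * (cases_i - lambda * pop_i).
    u = sum(e * (c - lam * n) for e, c, n in zip(exposure, cases, pop))

    # Null variance with lambda known: sum of exposure_i^2 * lambda * pop_i.
    var = lam * sum(e * e * n for e, n in zip(exposure, pop))

    if estimated:
        # ASSUMPTION: conditional (multinomial) variance, which subtracts
        # lambda * (sum of exposure_i * pop_i)^2 / (total population).
        var -= lam * sum(e * n for e, n in zip(exposure, pop)) ** 2 / total_pop

    z = u / math.sqrt(var)
    # One-sided upper-tail probability of a Normal(0,1) variable.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return u, z, p_value


# Hypothetical data for four areas (invented purely for illustration).
cases = [12, 9, 7, 5]
pop = [1000, 1000, 1000, 1000]
exposure = [1.0, 0.5, 0.25, 0.1]

u, z, p = score_test(cases, pop, exposure)
print(f"U = {u:.4f}, z = {z:.3f}, one-sided p = {p:.4f}")
```

Passing a 0/1 indicator as the exposure, together with a known lambda, makes the statistic the sum of (cases_i - lambda*pop_i) over the areas coded 1.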
This test has some nice power properties (locally most powerful versus the multiplicative alternative hypothesis). We didn't use it in the San Diego study, but it is pretty easy to calculate, so you might want to try it out before using Bayes/Empirical Bayes.

Bayes/Empirical Bayes can do spatial smoothing of rates and include a covariate, but many applications just use it for smoothing and do not consider particular covariates (like exposure). For a nice discussion of Bayes/Empirical Bayes methods see the chapter by Clayton and Bernardinelli in the 1992 Elliott et al book referenced below.

While the references above define the test in terms of exposure to fixed points (e.g. pollution sources), the score test is actually simply a test for trend in Poisson random variables (decreasing rates with decreasing exposure values). Typically in GIS-based studies the exposure proxies have some sort of spatial flavor (e.g. decreasing exposure with increasing distance), but this need not be the case. In the Waller et al references, we used inverse distance as a proxy for exposure, but you could do a better job using some of Dick Hoskins's suggestions in this case. Jim Case's suggestion of dividing into "exposed" and "nonexposed" groups reduces the test to a sum of "observed - expected" in the exposed areas only.

For more comments along the lines of Jim Case's regarding null and alternative hypotheses for spatial disease clustering, you may want to look over a paper Geoff Jacquez and I did a few years ago:

Waller, L.A. and Jacquez, G.M. (1995). ``Disease models implicit in statistical tests of disease clustering''. Epidemiology, 6, 584-590.

Finally, some review papers on statistical methods for such applications are:

Elliott, P., Cuzick, J., English, D., and Stern, R. (1992). Geographical and Environmental Epidemiology: Methods for Small-Area Studies. Oxford University Press.

Elliott, P., Wakefield, J.C., Best, N.G., and Briggs, D.J. (2000). Spatial Epidemiology: Methods and Applications.
Oxford University Press.

Elliott, P., Martuzzi, M., and Shaddick, G. (1995). Spatial statistical methods in environmental epidemiology: a critique. Statistical Methods in Medical Research, 4, 137-159.

Alexander, F.E., and Boyle, P. (eds) (1996). Methods for Investigating Localized Clustering of Disease. IARC Scientific Publications No. 135. International Agency for Research on Cancer, Lyon, France.

Lawson, A.B. and Waller, L.A. (1996). A review of point pattern methods for spatial modelling of events around sources of pollution. Environmetrics, 7, 471-488.

Marshall, R.J. (1991). A review of methods for the statistical analysis of spatial patterns of disease. Journal of the Royal Statistical Society, Series A, 154, 421-441.

I hope you find this information helpful.

Lance Waller

Health Maps wrote:
>
> The problem you will have with this study is that although many cases may
> live near the big roads, a lot of people will live near little roads that
> may also or may even be more likely to be adding to a respiratory disease
> process.
>
> More people tend to live near smaller roads than the large highways, and if
> the traffic density is high enough, their exposure may be greater. This is
> because there are more smaller roads and likely because of highway
> right-of-way rules people are allowed to live nearer non-major high traffic
> highways than big highways. For example, in the US although Interstates may
> have very high traffic densities, and although many people may live near
> those roads in urban areas, they are farther from the center line of the
> highways than those people that live on lesser roads which may or may not
> have even higher traffic densities. The diffusion process is inverse to the
> square root of the molecular weight of the pollutant and exponential, that
> is, ~ 1/distance. How far one lives from the road may be very important.
>
> This is not exactly a problem of confounding, but of mixed effect.
> Proximity to any road, not just the big ones, needs to be separated out or
> delineated in some way. The confounding comes in if there is a socioeconomic
> difference in who lives near a big highway and who does not.
>
> If you do not know other exposures that can lead to respiratory diseases,
> then you may have a very big problem. For example, do you know the smoking
> status of the cases? And would you know the smoking status of any control
> group you choose?
>
> The confounding part comes in because different socioeconomic classes tend
> to have different smoking rates, and different socioeconomic classes tend to
> have a differential probability of living near large roads - in my area.
>
> Richard Hoskins
> healthmaps@home.com
> GMT -8
>
> To subscribe to WAPHGIS, Washington Public Health GIS listserve, send a
> message to listproc@u.washington.edu
> with the request "subscribe waphgis" followed by your name in the body of
> the message, like so:
> subscribe waphgis Your Name
>
> -----Original Message-----
> From: WAPHGIS-owner@u.washington.edu
> [mailto:WAPHGIS-owner@u.washington.edu]On Behalf Of
> AZucchi@asl.bergamo.it
> Sent: Monday, February 19, 2001 1:11 AM
> To: waphgis@u.washington.edu
> Subject: Rif: Re: testing spatial association
>
> Dear colleagues,
> first of all thanks to all who kindly replied to my question.
> I have become aware of having been somewhat naive in putting the question and
> showing the cases map; it was just to exemplify.
> I obviously have all single administrative in-boundaries prevalence rates,
> and the map looks substantially the same.
> The boundaries represent the administrative city or village
> boundaries; from north to south the greatest distance is 40 kilometers.
> I was not interested in searching for clusters (as with SaTScan) tout court,
> but, as Dick Hoskins correctly defined, in looking for
> clusters/aggregation/association
> along the roads (not radially, but along a linear -or, almost linear-
> feature).
> Professor Jim Case suggests simply dividing the case population
> into two classes: near a road, not near a road (after a sound definition of
> the two, of the distance, and so on), and then testing the differences.
> Very correct is the suggestion to adopt empirical Bayes estimates to adjust
> for the small numbers. I will do it.
> Could you suggest some similar published study to use as an example to
> follow?
>
> Thank you
>
> Alberto Zucchi