When I considered how to organise my thesis, I wanted to boast that my study used a free software GIS, GRASS GIS, together with the free statistics package and language R, to achieve a spatial statistics capacity where even many commercial packages fall short. The two champions of the free software world cooperate almost seamlessly, passing maps and data back and forth and producing convincing statistics on a map. On second thought, however, I realised that "statistics on a map" is not the equivalent of "spatial statistics".
Although Geographic Information Systems were designed and built with statistics in mind, in their early days geographical data were included only as a context for other data such as census results or measurements. From an early GIS-generated map we can obtain information such as the total number of cars in the UK and their distribution among cities with a population larger than x. This is of course useful, but it is no longer sufficient now that people demand more of geographical data itself, partly thanks to Google Maps/Earth and all the mash-up applications built on them. The statistical ability of GIS therefore needs to be enhanced so that it can process geographical data as an explanatory variable and as a dependent variable, instead of just as a context.
By geographical data I mean the following attributes:
This is, of course, a non-exhaustive list. At present most GIS can process only a few of these attributes, and not at a very satisfying level. To enhance the ability of GIS, some hurdles have to be overcome first. The biggest may be the problem of description. The nature of GIS and its supporting technology means that GIS uses mathematics-based, reductionist approaches to store and express data discretely, whereas geography is continuous and includes concepts that are difficult to quantify. How do we state "the habitat is located around the intersection of a footpath and a main highway bridge near Baxton, with a thinner canopy in the northern part providing more favourable conditions for lower storey plants" to a computer? A promising direction of development seems to lie in semantics and cognitive science.
Another problem is the model used. In geography everything is linked, so how does one represent such links with mathematical models? Besides, statistics in its traditional sense is the aggregation of measurements, which can produce unexpected results in geographical studies. The problem is especially prominent when social processes are involved. At the least, simple aggregation based on descriptive statistics will have a more limited role. New techniques such as fractal mathematics and trend analysis may contribute more to solving the problem in the future.
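To make the point about aggregation concrete, here is a minimal sketch in Python. The plot counts and the helper `zone_means` are hypothetical, invented purely for illustration and not taken from my thesis data. It shows the same six measurements summarised under two different zone boundaries, an effect usually discussed under the name "modifiable areal unit problem".

```python
from statistics import mean


def zone_means(values, boundaries):
    """Split an ordered transect of measurements at the given indices
    and return the mean of each resulting zone."""
    edges = [0, *boundaries, len(values)]
    return [mean(values[a:b]) for a, b in zip(edges, edges[1:])]


# Hypothetical plant counts in six adjacent plots along a transect.
counts = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0]

# Same measurements, two ways of drawing the zone boundary.
zoning_a = zone_means(counts, [3])  # plots 0-2 | plots 3-5
zoning_b = zone_means(counts, [4])  # plots 0-3 | plots 4-5

print("zoning A:", zoning_a)
print("zoning B:", zoning_b)
```

The underlying measurements never change, yet the second zone looks noticeably poorer under zoning A than under zoning B, simply because one low-count plot was moved across the boundary.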
More developed spatial statistics will perhaps also demand more sophisticated spatial data, although improvements in methodology can alleviate that demand. When statisticians and geographers look at the more subtle aspects of the natural and social interactions taking place on the ground, they will need more data, such as the heterogeneity within a "patch" and individual behaviour. In return, a GIS fed with these data may provide insights into the design of public spaces, conservation priorities, and so on.
I have decided to wait and try to contribute to the (r)evolution of new spatial statistics. Until that takes place, my thesis will be "using GIS technology and simple statistics with a spatial concern".
Many of the ideas in this entry come from or are inspired by this slide by Michael F. Goodchild of the University of California, Santa Barbara.