Saturday, 23 August 2008

New challenges for ecological/environmental research

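Holling (1973)[1. Holling, C. S. Resilience and stability of ecological systems. Annual Review of Ecology and Systematics, 4:1–23, 1973.] defined the resilience of an ecosystem as the largest disturbance the system can absorb before its structure is reorganised and its functions change. It was not until a decade later that Pimm (1984)[2. Pimm, S. L. The complexity and stability of ecosystems. Nature, 307:321–326, 1984.] proposed another definition of resilience: the time a system needs to return to its initial state after a disturbance. Although Pimm's definition came later, it has been applied far more widely. Holling's definition, by contrast, is often cited by researchers but rarely applied in actual studies, so that those who study and practise the concept now seem to form a school that attracts little notice. Why has Holling's idea found so few followers?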

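Resilience and the Behavior of Large-Scale Systems (hereafter RBLSS) can be read as one more attempt by this long-neglected school to win the attention of the academic community. The main authors of its chapters include L. H. Gunderson, C. S. Holling, G. D. Peterson, B. H. Walker, S. R. Carpenter and D. Ludwig. In fact, they are the only scholars one can find fairly easily through an ordinary journal search who advocate using Holling's definition of resilience to study coupled social-ecological systems. In the book they restate their position: the premise of Pimm's resilience does not hold, because a complex system has no single, constant, stable equilibrium state. Instead, a system may have several equilibrium states, and external forces can move it from one to another. Near an equilibrium state (in what is called the stability domain or basin of attraction), the system changes continuously and almost linearly and can be described with equations, but once the change exceeds a certain range, the transition is abrupt and nonlinear. To define or assess the resilience of a system is therefore to find the extent of the stability domain and the other possible equilibrium states.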

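This is rather abstract, so Holling's followers illustrate it with the image of a ball in a basin. The ball stands for the system and sits in a basin (the stability domain); the width of the basin is the degree of resilience; and once the ball crosses the rim it falls into another basin, that is, a system state whose properties, structure and functions may be completely different. Mathematically, they use the concept of a bifurcation point to show how a system leaves its stability domain. Much like the familiar idea that quantitative change leads to qualitative change, at certain values of some system variables the behaviour of an equation like the following changes dramatically in the neighbourhood of that value: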

$$\frac{dx}{dt} = -x(x^{2} - \alpha)$$



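Such variables ($$\alpha$$ in the equation above) are called the "slow variables" of the system: they usually change slowly, yet they have a fundamental influence on the system's properties and functions. At this point we can already see that Holling's definition captures the non-equilibrium nature of ecosystems better than Pimm's and also avoids excessive reductionism. It is, however, much harder to put into practice than Pimm's, which can be measured through empirical observation alone. Holling's definition requires the researcher to have a fairly complete understanding of the system's overall properties and of the internal and external factors acting on it, which is hard to achieve for large ecological and socio-economic systems. No wonder many are deterred.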
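To make the bifurcation concrete, here is a minimal Python sketch of the equation above. It is only an illustration, not anything taken from RBLSS: the function names, the forward-Euler integrator and the step size are my own assumptions. It lists the equilibria for a given $$\alpha$$, classifies them by the sign of the derivative of the right-hand side, and integrates the equation from a starting point to see which basin the "ball" ends up in.

```python
import math

def equilibria(alpha):
    """Return (x_star, is_stable) pairs for dx/dt = -x(x^2 - alpha).

    With f(x) = -x^3 + alpha*x we have f'(x) = alpha - 3x^2, and an
    equilibrium is (linearly) stable when f'(x_star) < 0.
    At alpha == 0 the test is inconclusive and x = 0 is reported as not stable.
    """
    points = [0.0]
    if alpha > 0:
        root = math.sqrt(alpha)
        points = [-root, 0.0, root]
    return [(x, alpha - 3 * x * x < 0) for x in points]

def simulate(x0, alpha, dt=0.01, steps=5000):
    """Forward-Euler integration of the same equation, starting from x0.

    dt and steps are illustrative choices, not tuned values.
    """
    x = x0
    for _ in range(steps):
        x += dt * (-x * (x * x - alpha))
    return x

if __name__ == "__main__":
    for alpha in (-1.0, 1.0):
        print(alpha, equilibria(alpha), simulate(0.1, alpha))
```

For negative $$\alpha$$ there is a single basin around $$x = 0$$; once $$\alpha$$ crosses zero that state becomes unstable and two new basins appear at $$x = \pm\sqrt{\alpha}$$, which is the kind of qualitative change the slow variable brings about.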

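Perhaps for this very reason, RBLSS, a book that grew out of a workshop, focuses on concrete case studies of large systems. The systems studied include the main terrestrial ecosystems such as forests, wetlands and grasslands, as well as aquatic ecosystems such as lakes and coral reefs. The research methods are equally varied, including scenario analysis, mathematical modelling, field observation and geographic information science. These cases do demonstrate how widely the concept of resilience applies, but they also expose a major weakness: there is no consistent methodology across different ecosystems. Even within the same type of ecosystem, differences in species and disturbance factors mean that researchers often cannot reuse the analyses given in the book and have to start their own analysis from the conceptual model. The lack of "reusable" parts in the whole framework undoubtedly keeps the concept from becoming common practice in industry and academia in the way environmental impact assessment has.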

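Of course, as a framework the book does offer a pattern to follow, including building a conceptual model and identifying cross-scale processes and slow variables, which those who wish to use the concept in their research can consult. But I think the greater significance of the book is that it points to a path by which ecological research might leap from its current reductionist methods to truly integrative research that brings the "human" factor into its scope, and so challenges traditional research methods. At the same time, in contrast to environmental management based on Pimm's concept, which takes one particular state as its target and demands that it be maintained and defended, Holling's concept of resilience allows him to propose a new idea of ecological/environmental management: adaptive management, which acknowledges change and non-equilibrium in ecological and social systems. This has greater practical significance than the concept of resilience itself, and it has already been applied in some concrete environmental management plans. Assessing resilience also requires researchers to test existing ecological models and to study and summarise the actors and interactions within specific ecosystems, which to some extent also advances traditional ecological theory and methodology. In this sense, the book and the concept of resilience itself have a significance for researchers in ecology and environmental science that goes beyond their immediate value.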

Tuesday, 19 August 2008

How will statistics go spatial

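Abstract: a discussion of the elements of spatial statistics, the difficulties it faces, and possible directions for its development.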

When I considered how to organise my thesis, I wanted to boast that my study used the free software GIS, GRASS GIS, together with the free statistics package and language R to achieve a spatial statistics capability in which even many commercial software packages are handicapped. The two champions of the free software world cooperate almost seamlessly, passing maps and data back and forth and producing convincing statistics on a map.  On second thought, however, I realised that "statistics on a map" is not the equivalent of "spatial statistics".

Although Geographic Information Systems were designed and built with statistics in mind, in their early days geographical data were included only as a context for other data such as census results or measurements.  From an early GIS-generated map we can obtain information such as the total number of cars in the UK and their distribution among cities whose population is larger than x.  This is of course useful information, but it is insufficient now that people demand more of geographical data itself, partly thanks to Google Maps/Earth and all the mash-up applications.  The statistical ability of GIS therefore needs to be enhanced so that it can process geographical data both as an explanatory variable and as a dependent variable, instead of just as a context.

By geographical data I mean the following attributes:

  • Vicinity
  • Direction
  • Gradient
  • Visibility
  • Fragmentation
  • Shape
  • Spatial correlation
  • Accessibility
  • Anomalies
  • Agents and behaviour

This is, of course, a non-exhaustive list.  At present most GIS can process only a few of these attributes, and not at a very satisfying level.  To enhance the ability of GIS, some hurdles have to be overcome first.  The biggest may be the problem of description.  The nature of GIS and its supporting technology means that GIS use mathematically based, reductionist approaches to store and express data in a discrete way, whereas geography is continuous and includes concepts that are difficult to quantify.  How do we state "the habitat is located around the intersection of a footpath and a main highway bridge near Baxton, with a thinner canopy in the northern part providing more favourable conditions for lower storey plants" to a computer?  A promising direction of development seems to lie in semantics and cognitive science.

Another problem is the model used.  In geography everything is linked, so how does one represent such links with mathematical models?  Besides, statistics in its traditional sense is the aggregation of measurements, which can bring unexpected results in geographical studies.  This problem is especially prominent when social processes are involved.  At the least, simple aggregation based on descriptive statistics will have a more limited role.  Newer techniques such as fractal mathematics and trend analysis may contribute more to solving the problem in the future.

More developed spatial statistics will perhaps also demand more sophisticated spatial data, although improvements in methodology can alleviate that demand.  When statisticians and geographers look at subtler aspects of the natural and social interactions taking place on the ground, they will need more data, such as the heterogeneity within a "patch" and individual behaviour.  In return, GIS fed on these data may provide insights into the design of public spaces, conservation priorities, and so on.

I have decided to wait for, and try to contribute to, the (r)evolution of new spatial statistics.  Until that takes place, my thesis will be "using GIS technology and simple statistics with a spatial concern".

Many of the ideas in this entry come from or are inspired by this slide by Michael F. Goodchild of the University of California, Santa Barbara.

Monday, 11 August 2008

Emergence, or merely a mistake?

When I conducted DFA on my NDVI data, I found some interesting behaviour in the results.  Apparently the system behaves differently at different spatial scales, which is reflected in the correlation between fluctuation and temporal scale, that is, in the slope $$\alpha$$ of $$\log F(L)$$ against $$\log L$$.

The DFA result for the whole map is shown in the picture below: $$\alpha$$ is between 0.5 and 1, indicating a correlation between fluctuation and time.  The per-pixel results, however, are seemingly random (this is expected) and all have $$\alpha \ll 0.5$$ (this is quite unexpected), indicating anti-correlation.

DFA map result



After examining the scripts and data in question, I could find no fault in the design of the experiment.  I have also considered the possibility that nonstationarity and trend affect the result, but the NDVI data I used have already had the seasonal trend removed, and nonstationarity is not likely to cause such a dramatic change in $$\alpha$$.  The result, if not a technical failure, suggests that the system behaves differently at the 8 km (single-pixel) and regional scales.  At the fine-grained scale, fluctuation diminishes as the temporal scale increases, i.e. a fluctuation is likely to cause smaller and smaller fluctuations in the distant future.  At the coarser scale the behaviour is reversed: a fluctuation may cause larger and larger fluctuations at long intervals.


It is possible that the system's behaviour at coarser scales results from relatively simple interactions at finer scales, a phenomenon known as emergence.  But if it really is emergence, at what critical scale does the shift in behaviour take place?  Besides, what I have now is at best a correlation; is there an intrinsic mechanism that determines such emergence?  In ecology, has similar emergence been observed in the shift from community to biome scale?


The system I'm studying is a pastoral socio-ecosystem.  If there really is emergence like this, does it suggest that the system is resilient at the farm scale but not as a whole [1. Some resilience researchers have noted the problem of emergence, such as


Allen, C.R., Gunderson, L. & Johnson, A.R., 2005. The Use of Discontinuities and Functional Groups to Assess Relative Resilience in Complex Systems. Ecosystems, 8(8), 958-966. and


Folke, C., 2006. Resilience: The emergence of a perspective for social-ecological systems analyses. Global Environmental Change, 16(3), 253-267.]?  This is a really intriguing question.  I just hope it does not arise from a false observation.

Friday, 1 August 2008

Find underlying fluctuation using DFA

NDVI time series showing data from two sources


We often have to deal with NDVI time series like this.  What on earth does it tell us?  Very little information can be read from this figure or its subsets.  However, techniques like DFA allow us to dig out some information on the fluctuation and its cause.


Detrended Fluctuation Analysis (DFA) was first used in DNA analysis to identify whether the distribution of purine and pyrimidine bases is scale-invariant, or fractal.  It has since been used by many to examine whether a bounded time series has fractal characteristics and is self-similar.  First, the time series has to be converted to a random walk series using the equation:


$$X_{t} = \sum_{i=1}^{t}(x_{i}-\langle x_{i} \rangle)$$


where $$\langle x_{i} \rangle$$ is a value within the range of the time series (most commonly its mean), or a variable that changes with the element in the time series according to a certain rule.  This subtraction is a first, global "detrending" step; the local trends are removed within each window in the next step.  Once the series has been converted, one can split the whole series into a number of non-overlapping "windows", each with an equal number of samples, or length, $$L$$.


A least-squares linear fit $$ai + b$$ is applied to each window to obtain the local trend.  The root-mean-square deviation from the trend is called the fluctuation:


$$F(L) = \left[\frac{1}{L}\sum_{i=1}^{L}(X_{i} - ai - b)^{2}\right]^{\frac{1}{2}}$$


For different window sizes $$L$$ we obtain different $$F(L)$$.  On a plot of $$\log F(L)$$ against $$\log L$$, the points fall on a straight line with slope $$\alpha$$ if the random walk is self-affine.  The values of $$\alpha$$ have different meanings: $$\frac{1}{2} < \alpha < 1$$ indicates that the fluctuation is correlated with time, while $$\alpha$$ close to 0.5 indicates white noise [1. Detrended fluctuation analysis. (2007, October 31). In Wikipedia, The Free Encyclopedia. Retrieved 13:15, August 1, 2008, from http://en.wikipedia.org/w/index.php?title=Detrended_fluctuation_analysis&oldid=168313780].
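The steps above can be sketched in a few lines of Python with NumPy.  This is a minimal illustration under my own assumptions, not the script used for the NDVI analysis: the function name `dfa`, the choice of the series mean as $$\langle x_{i} \rangle$$, the first-order (linear) local fit and the pooling of residuals over all windows of a given size are choices made here for clarity.

```python
import numpy as np

def dfa(x, window_sizes):
    """Estimate the DFA exponent alpha of a 1-D series.

    Returns (alpha, fluctuations), where fluctuations[k] is F(L) for
    window_sizes[k]. Assumes every L is at least 3 and no longer than the series.
    """
    x = np.asarray(x, dtype=float)
    # Random-walk ("profile") series: cumulative sum of deviations from the mean.
    profile = np.cumsum(x - x.mean())

    fluctuations = []
    for L in window_sizes:
        n_windows = len(profile) // L
        # Non-overlapping windows of length L; leftover samples at the end are dropped.
        segments = profile[: n_windows * L].reshape(n_windows, L).T  # shape (L, n_windows)
        i = np.arange(L)
        # Least-squares line a*i + b for every window at once.
        a, b = np.polyfit(i, segments, 1)
        residuals = segments - (np.outer(i, a) + b)
        # Root-mean-square deviation from the local trends, pooled over all windows.
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))

    fluctuations = np.array(fluctuations)
    # alpha is the slope of log F(L) against log L.
    alpha = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)[0]
    return alpha, fluctuations

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(4096)
    print(dfa(noise, [16, 32, 64, 128, 256])[0])
```

As a sanity check, uncorrelated white noise should give $$\alpha$$ near 0.5, and its cumulative sum (a random walk) should give $$\alpha$$ near 1.5.  Window sizes are best spread evenly on a log scale so that the final line fit is not dominated by one end of the range.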


This technique has further been used to identify whether there is an identifiable trend in the variation of NDVI across years.  Telesca et al. (2005)[2. Luciano Telesca, Rosa Lasaponara, Antonio Lanorte, $$1/f^{\alpha}$$ fluctuations in the time dynamics of Mediterranean forest ecosystems by using normalized difference vegetation index satellite data, Physica A: Statistical Mechanics and its Applications, Volume 361, Issue 2, 1 March 2006, Pages 699-706.] used the technique and identified positive feedbacks and reduced adaptability in Italian forest ecosystems.  Note that NDVI has a natural periodic fluctuation due to seasonality; before DFA is applied, one should first remove this fluctuation from the data.
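One simple way to remove the seasonal cycle, sketched here as an assumption rather than as the method used by Telesca et al. or in my own data preparation, is to subtract from each observation the long-term mean for its position in the year.  The function name and the `period` parameter (for example 12 for monthly composites) are hypothetical.

```python
import numpy as np

def remove_seasonal_cycle(series, period):
    """Subtract the mean seasonal cycle from a regularly sampled series.

    `period` is the number of samples per year. Assumes the series starts
    at the beginning of a cycle and has at least `period` samples.
    """
    series = np.asarray(series, dtype=float)
    # Position of each sample within the year: 0, 1, ..., period - 1, 0, 1, ...
    phase = np.arange(len(series)) % period
    # Long-term mean for each position in the year (the "climatology").
    climatology = np.array([series[phase == p].mean() for p in range(period)])
    # Anomalies: what is left after the average seasonal pattern is removed.
    return series - climatology[phase]
```

The resulting anomalies, not the raw NDVI values, would then be passed to the DFA.  Other choices, such as fitting harmonics or differencing at lag `period`, are possible and may change the estimated exponent.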


This technique may also be applied in my study, as I am also looking at NDVI trends.  Some spatial analysis may be added by applying the technique to single pixels and/or "windows".