**7. Appendix: Technical details**

#### **7.1 Model fitting and software**

Here we describe methods for fitting our hierarchical model using the Markov chain Monte Carlo (MCMC) algorithms implemented in the software package JAGS (Just Another Gibbs Sampler), which is freely available at http://mcmc-jags.sourceforge.net. This software allows the user to specify a model in terms of its underlying assumptions, which include the distributions assumed for the observed data and the model's parameters. The latter distributions include priors, which are needed to conduct a Bayesian analysis of the data (see below). Part of the reason for the popularity of JAGS is that it allows the model to be specified and fitted without requiring the user to derive the MCMC sampling algorithms used in computing the joint posterior. That said, naive use of JAGS may yield undesirable results, and some experience is needed to ensure the accuracy of the results.

We prefer to execute JAGS remotely from R (R Development Core Team 2004) using functions defined in the R package rjags (http://mcmc-jags.sourceforge.net). In this way R is used to organize the data, to provide inputs to JAGS, and to receive outputs (results) from JAGS. However, the model's distributional assumptions must be specified in the native language of JAGS. The data files and source code needed to fit our model are provided below. In our analysis of each data set, the posterior was calculated by initializing each of 5 Markov chains independently and running each chain for a total of 250,000 draws. The first 50,000 draws of each chain were discarded as "burn-in", and every 50th draw in the remainder of each chain was retained to form the posterior sample. Based on Gelman-Rubin diagnostics of the model's parameters (Brooks & Gelman 1998), this approach appeared to produce Markov chains that had converged to their stationary distribution. Therefore, we used the resulting posterior sample of 20,000 draws (5 chains with 4,000 retained draws each) to compute estimates of the model's parameters and 95% credible intervals.
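The sampling schedule described above can be checked with a few lines of arithmetic. The following Python sketch is not the authors' R/JAGS code; it only counts the retained draws and illustrates the kind of convergence diagnostic cited. The function names `retained_draws` and `gelman_rubin` are hypothetical, and the formula is the basic textbook form of the potential scale reduction factor, which may differ in detail from the variant reported by any particular package.

```python
import math
import random


def retained_draws(n_chains, n_iter, n_burnin, thin):
    """Number of posterior draws kept after burn-in and thinning."""
    per_chain = (n_iter - n_burnin) // thin
    return n_chains * per_chain


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1)
        for c, mu in zip(chains, means)
    ) / m
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)


# Settings reported in the text: 5 chains, 250,000 draws each,
# 50,000 burn-in draws discarded, every 50th remaining draw kept.
total = retained_draws(n_chains=5, n_iter=250_000, n_burnin=50_000, thin=50)
print(total)

# Toy illustration with simulated draws (not output from the ant model):
# chains sharing one distribution versus chains stuck at different levels.
rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(5)]
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(2000)] for k in range(5)]
print(round(gelman_rubin(mixed), 3), round(gelman_rubin(stuck), 3))
```

Values of the diagnostic close to 1 are consistent with convergence, whereas values well above 1 indicate that the chains have not yet mixed.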

#### **7.2 Prior distributions**

Our prior distributions were chosen to specify prior indifference in the magnitude of each parameter. For example, we assumed a Uniform(0,1) prior for Ω, the probability that a species in the augmented data set is a member of the *N* species vulnerable to capture. It is easily shown that this prior induces a discrete uniform prior on *N*, which assigns equal probability to each integer in the set {0, 1, . . . , *M*}. We also used the uniform distribution for the correlation parameter *ρ*; specifically, we assumed a Uniform(-1,1) prior for *ρ*, thereby favoring no particular value of *ρ* in the analysis.
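The induced prior on *N* mentioned above can be verified directly. As a sketch, assume (as is standard in data augmentation, although not restated in this appendix) that each of the *M* species in the augmented data set is a member of the community independently with probability Ω, so that *N* given Ω is Binomial(*M*, Ω). Integrating over the Uniform(0,1) prior for Ω then gives

$$\Pr(N = n) = \int_0^1 \binom{M}{n}\,\Omega^{n}(1-\Omega)^{M-n}\,d\Omega = \binom{M}{n}\,\frac{n!\,(M-n)!}{(M+1)!} = \frac{1}{M+1}, \qquad n = 0, 1, \ldots, M,$$

which does not depend on *n*; that is, each of the *M* + 1 possible values of *N* receives the same prior probability.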

Each of the heterogeneity parameters (*σ*<sub>*a*0</sub>, *σ*<sub>*b*0</sub>, *σ*<sub>*bl*</sub>) was assigned a half-Cauchy prior (Gelman 2006) with unit scale parameter, which has probability density function

$$f(\sigma) = 2/[\pi(1+\sigma^2)].$$

Gelman (2006) showed that this prior avoids problems that can occur when alternative "noninformative" priors are used (including the nearly improper Inverse-Gamma(*ε*, *ε*) family).
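As a worked check (not part of the original derivation), the half-Cauchy density with unit scale integrates to the distribution function

$$F(\sigma) = \int_0^{\sigma} \frac{2}{\pi(1+s^2)}\,ds = \frac{2}{\pi}\arctan(\sigma), \qquad \sigma \ge 0.$$

It follows that $F(1) = 1/2$, so the prior median equals the scale parameter, and that $\Pr(\sigma > 10) = 1 - (2/\pi)\arctan(10) \approx 0.063$. The prior therefore concentrates on moderate values of each heterogeneity parameter while retaining a heavy tail that does not rule out large values.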

Currently, there is no consensus choice of noninformative prior for the logit-scale parameters of logistic-regression models (Gelman et al. 2008, Marin & Robert 2007). To specify a prior for the logit-scale parameters of our model (*α*<sub>0</sub>, *β*<sub>0</sub>, *β*<sub>*l*</sub>), we used an approach described by Gelman et al. (2008). Recall that the covariates of our model are centered and scaled to have mean zero and unit variance; therefore, we seek a prior that assigns low probabilities to large effects on the logit scale. The reason for this choice is that a difference of 5 on the logit scale can shift a probability by nearly 0.5 (for example, from 0.5 to 0.99). Because shifts in the value of a standardized covariate seldom, in practice, correspond to outcome probabilities that change from 0.01 to 0.99, the prior of a logit-scale parameter should assign low probabilities to values outside the interval (-5,5). The family of zero-centered t-distributions with parameters *σ* (scale) and *ν* (degrees of freedom) can be used to specify priors with this goal in mind.

For example, Gelman et al. (2008) recommended a t-distribution with *σ* = 2.5 and *ν* = 1 as a "robust" alternative to a t-family approximation of Jeffreys' prior (*σ* = 2.5 and *ν* = 7). However, when the logit-scale parameter (say, *θ*) is transformed to the probability scale (*p* = 1/(1 + exp(−*θ*))), both of these priors assign high probabilities in the vicinity of *p* = 0 and *p* = 1, which is not always desirable. As an alternative, we used a t-distribution with *σ* = 1.566 and *ν* = 7.763 as a prior for each logit-scale parameter of our model. This distribution approximates a Uniform(0, 1) prior for *p* and assigns low probabilities to values outside the interval (-5,5).
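The behaviour of these priors on the probability scale can be examined numerically. The following Python sketch is an illustration, not the authors' code: it applies the change of variables *p* = 1/(1 + exp(−*θ*)) to a scaled t density and integrates the tails with Simpson's rule. The function names `t_pdf`, `induced_density` and `mass_outside` are hypothetical.

```python
import math


def t_pdf(x, scale, df):
    """Density of a zero-centered Student t with the given scale and df."""
    z = x / scale
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df)) / scale


def induced_density(p, scale, df):
    """Density of p = 1/(1 + exp(-theta)) when theta has the t prior."""
    theta = math.log(p / (1 - p))
    # Change of variables: d(theta)/dp = 1 / (p (1 - p)).
    return t_pdf(theta, scale, df) / (p * (1 - p))


def mass_outside(limit, scale, df, steps=20_000):
    """Prior probability that |theta| > limit (composite Simpson's rule)."""
    h = 2 * limit / steps
    total = t_pdf(-limit, scale, df) + t_pdf(limit, scale, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-limit + i * h, scale, df)
    return 1 - total * h / 3


# (scale, degrees of freedom) pairs quoted in the text.
priors = {
    "sigma = 2.5, nu = 1": (2.5, 1.0),
    "sigma = 2.5, nu = 7": (2.5, 7.0),
    "sigma = 1.566, nu = 7.763": (1.566, 7.763),
}
for name, (scale, df) in priors.items():
    dens = [round(induced_density(p, scale, df), 2) for p in (0.01, 0.1, 0.5)]
    print(name, dens, round(mass_outside(5.0, scale, df), 3))
```

Under a Uniform(0, 1) prior the induced density would equal 1 at every *p*, so values well above 1 at *p* = 0.01 indicate prior mass piling up near the boundary, while the last column gives the prior probability assigned to values outside (-5,5).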

Given our choice of priors and the amount of information in the ant data, parameter estimates based on a single model are unlikely to be sensitive to the priors used in our analysis. However, it is well known that the distributional form of a noninformative prior can exert considerable influence on posterior model probabilities (Kadane & Lazar 2004, Kass & Raftery 1995). Because these probabilities are used to select a single model for inference, we examined the sensitivity of the model probabilities to our choice of priors. In particular, we considered a t-family approximation of Jeffreys' prior (*σ* = 2.482 and *ν* = 5.100) as an alternative for the logit-scale parameters of our model. As described earlier, Jeffreys' prior is commonly used in Bayesian analyses of logistic-regression models.

#### **7.3 Data files and source code**

The following files were used to fit our hierarchical model to the ant data sets.

**AntDetections1999.csv** – species- and site-specific capture frequencies of ants in bog and forest habitats (format is comma-delimited with first row as header)

**GetDetectionMatrix.R** – R code for reading capture frequencies of ants from data file and returning a species- and site-specific matrix of capture frequencies of ants collected in a specified habitat ('Forest' or 'Bog')

**GetSiteCovariates.R** – R code for reading covariates from data file

**MultiSpeciesOccModelAve.R** – R and JAGS code for defining and fitting the hierarchical model

**SiteCovariates.csv** – site-specific values of covariates (format is comma-delimited with first row as header)
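To illustrate the kind of processing involved in turning a comma-delimited file with a header row into a species-by-site matrix for one habitat, here is a minimal Python sketch. It is not a translation of the R scripts listed in this section, whose contents are not reproduced here. The column names (`species`, `habitat`, `site`, `captures`), the sample rows, and the function `detection_matrix` are all hypothetical placeholders, since the actual layout of the data file is not described in this appendix.

```python
import csv
import io

# Made-up placeholder rows; the real file's columns are not described here.
SAMPLE = """species,habitat,site,captures
species_A,Bog,site_1,3
species_A,Forest,site_2,0
species_B,Bog,site_1,1
species_B,Bog,site_3,2
"""


def detection_matrix(handle, habitat):
    """Return (species, sites, matrix) of capture counts for one habitat."""
    rows = [r for r in csv.DictReader(handle) if r["habitat"] == habitat]
    species = sorted({r["species"] for r in rows})
    sites = sorted({r["site"] for r in rows})
    # Rows index species, columns index sites; missing pairs stay at zero.
    matrix = [[0] * len(sites) for _ in species]
    for r in rows:
        i = species.index(r["species"])
        j = sites.index(r["site"])
        matrix[i][j] += int(r["captures"])
    return species, sites, matrix


species, sites, matrix = detection_matrix(io.StringIO(SAMPLE), "Bog")
print(species, sites, matrix)
```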


#### **8. References**


Anderson, M. J., Crist, T. O., Chase, J. M., Vellend, M., Inouye, B. D., Freestone, A. L., Sanders, N. J., Cornell, H. V., Comitka, L. S., Davies, K. F., Harrison, S. P., Kraft, N. J. B., Stegen, J. C. & Swenson, N. G. (2011). Navigating the multiple meanings of *β* diversity: a roadmap for the practicing ecologist, *Ecology Letters* 14: 19–28.

Boulinier, T., Nichols, J. D., Sauer, J. R., Hines, J. E. & Pollock, K. H. (1998). Estimating species richness: the importance of heterogeneity in species detectability, *Ecology* 79: 1018–1028.

Brooks, S. P. & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations, *Journal of Computational and Graphical Statistics* 7: 434–455.

Colwell, R. K., Mao, C. X. & Chang, J. (2004). Interpolating, extrapolating, and comparing incidence-based species accumulation curves, *Ecology* 85: 2717–2727.

Dorazio, R. M. (2007). On the choice of statistical models for estimating occurrence and extinction from animal surveys, *Ecology* 88: 2773–2782.

Dorazio, R. M., Kéry, M., Royle, J. A. & Plattner, M. (2010). Models for inference in dynamic metacommunity systems, *Ecology* 91: 2466–2475.

Dorazio, R. M. & Royle, J. A. (2005). Estimating size and composition of biological communities by modeling the occurrence of species, *Journal of the American Statistical Association* 100: 389–398.

Dorazio, R. M., Royle, J. A., Söderström, B. & Glimskär, A. (2006). Estimating species richness and accumulation by modeling species occurrence and detectability, *Ecology* 87: 842–854.

Draper, D. (1995). Assessment and propagation of model uncertainty (with discussion), *Journal of the Royal Statistical Society, Series B* 57: 45–97.

Ellison, A. M., Gotelli, N. J., Alpert, G. D. & Farnsworth, E. J. (2012). *A field guide to the ants of New England*, Yale University Press, New Haven, Connecticut.

Francoeur, A. (1997). Ants (Hymenoptera: Formicidae) of the Yukon, *in* H. V. Danks & J. A. Downes (eds), *Insects of the Yukon*, Survey of Canada (Terrestrial Arthropods), Ottawa, Ontario, pp. 901–910.

Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (Comment on article by Browne and Draper), *Bayesian Analysis* 1: 515–534.

Gelman, A., Carlin, J. B., Stern, H. S. & Rubin, D. B. (2004). *Bayesian data analysis, second edition*, Chapman and Hall, Boca Raton.

Gelman, A., Jakulin, A., Pittau, M. G. & Su, Y.-S. (2008). A weakly informative default prior distribution for logistic and other regression models, *Annals of Applied Statistics* 2: 1360–1383.

Genet, K. S. & Sargent, L. G. (2003). Evaluation of methods and data quality from a volunteer-based amphibian call survey, *Wildlife Society Bulletin* 31: 703–714.

Gotelli, N. J. (2000). Null model analysis of species co-occurrence patterns, *Ecology* 81: 2606–2621.

Gotelli, N. J. & Ellison, A. M. (2002). Biogeography at a regional scale: determinants of ant species density in New England bogs and forests, *Ecology* 83: 1604–1609.

Gotelli, N. J., Ellison, A. M., Dunn, R. R. & Sanders, N. J. (2011). Counting ants (Hymenoptera: Formicidae): biodiversity sampling and statistical analysis for myrmecologists, *Myrmecological News* 15: 13–19.

Holyoak, M. & Mata, T. M. (2008). Metacommunities, *in* S. E. Jorgensen & B. D. Fath (eds), *Encyclopedia of Ecology*, Academic Press, Oxford, pp. 2313–2318.

Kadane, J. B. & Lazar, N. A. (2004). Methods and criteria for model selection, *Journal of the American Statistical Association* 99: 279–290.

Kass, R. E. & Raftery, A. E. (1995). Bayes factors, *Journal of the American Statistical Association* 90: 773–795.

Kéry, M., Dorazio, R. M., Soldaat, L., van Strien, A., Zuiderwijk, A. & Royle, J. A. (2009). Trend estimation in populations with imperfect detection, *Journal of Applied Ecology* 46: 1163–1172.

Kéry, M. & Royle, J. A. (2009). Inference about species richness and community structure using species-specific occupancy models in the national Swiss breeding bird survey MHB, *in* D. L. Thomson, E. G. Cooch & M. J. Conroy (eds), *Modeling demographic processes in marked populations, series: environmental and ecological statistics, volume 3*, Springer, Berlin, pp. 639–656.

Kéry, M., Royle, J. A., Plattner, M. & Dorazio, R. M. (2009). Species richness and occupancy estimation in communities subject to temporary emigration, *Ecology* 90: 1279–1290.

Kuo, L. & Mallick, B. (1998). Variable selection for regression models, *Sankhya* 60B: 65–81.

Leibold, M. A., Holyoak, M., Mouquet, N., Amarasekare, P., Chase, J. M., Hoopes, M. F., Holt, R. D., Shurin, J. B., Law, R., Tilman, D., Loreau, M. & Gonzalez, A. (2004). The metacommunity concept: a framework for multi-scale community ecology, *Ecology Letters* 7: 601–613.

MacKenzie, D. I., Bailey, L. L. & Nichols, J. D. (2004). Investigating species co-occurrence patterns when species are detected imperfectly, *Journal of Animal Ecology* 73: 546–555.

Magurran, A. E. (2004). *Measuring biological diversity*, Blackwell, Oxford.

Marin, J.-M. & Robert, C. P. (2007). *Bayesian Core*, Springer, New York.

McClintock, B. T., Bailey, L. L., Pollock, K. H. & Simon, T. R. (2010a). Experimental investigation of observation error in anuran call surveys, *Journal of Wildlife Management* 74: 1882–1893.

McClintock, B. T., Bailey, L. L., Pollock, K. H. & Simon, T. R. (2010b). Unmodeled observation error induces bias when inferring patterns and dynamics of species occurrence via aural detections, *Ecology* 91: 2446–2454.

R Development Core Team (2004). *R: A language and environment for statistical computing*, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL: *http://www.R-project.org*

Rich, P. M., Clark, D. B., Clark, D. A. & Oberbauer, S. F. (1993). Long-term study of solar radiation regimes in a tropical wet forest using quantum sensors and hemispherical photography, *Agricultural and Forest Meteorology* 65: 107–127.

Robert, C. P. & Casella, G. (2004). *Monte Carlo Statistical Methods (second edition)*, Springer-Verlag, New York.

Royle, J. A. (2006). Site occupancy models with heterogeneous detection probabilities, *Biometrics* 62: 97–102.

Royle, J. A. & Dorazio, R. M. (2008). *Hierarchical modeling and inference in ecology*, Academic Press, Amsterdam.

Royle, J. A. & Dorazio, R. M. (2011). Parameter-expanded data augmentation for Bayesian analysis of capture-recapture models, *Journal of Ornithology* 123: in press.

Royle, J. A. & Link, W. A. (2006). Generalized site occupancy models allowing for false positive and false negative errors, *Ecology* 87: 835–841.

Russell, R. E., Royle, J. A., Saab, V. A., Lehmkuhl, J. F., Block, W. M. & Sauer, J. R. (2009). Modeling the effects of environmental disturbance on wildlife communities: avian responses to prescribed fire, *Ecological Applications* 19: 1253–1263.

Simons, T. R., Alldredge, M. W., Pollock, K. H. & Wettroth, J. M. (2007). Experimental analysis of the auditory detection process on avian point counts, *Auk* 124: 986–999.

Umphrey, G. (1996). Morphometric discrimination among sibling species in the *fulva* - *rudis* - *texana* complex of the ant genus *Aphaenogaster* (Hymenoptera: Formicidae), *Canadian Journal of Zoology* 74: 528–559.


**14**

**Isolation and Identification of Indigenous Microorganisms of Cocoa Farms in Côte d'Ivoire and Assessment of Their Antagonistic Effects Vis-À-Vis *Phytophthora palmivora*, the Causal Agent of the Black Pod Disease**

Joseph Mpika, Ismaël B. Kebe and François K. N'Guessan

*Laboratoire de Phytopathologie, CNRA, Divo, Côte d'Ivoire*

**1. Introduction**

Black pod disease, caused by *Phytophthora* spp., is a destructive disease of cocoa. Worldwide, yield losses have been estimated at 30% (Lass, 1985). Côte d'Ivoire, the world's leading cocoa-producing country with 44% of the world market (ICCO, 2000), is also affected by this disease. Several species of *Phytophthora* are involved in the disease. In Africa, two species, *P. palmivora* and *P. megakarya*, are the most damaging. The first species, which is the most common, causes damage in all the cocoa-producing countries of the world, with yield losses between 20 and 30%; the second, endemic to central and west Africa, is the most aggressive. This pathogen may cause the loss of the whole pod production in some countries (Flood, 2006). In Côte d'Ivoire, since the discovery of *P. megakarya* in the western region in the 1990s, the black pod disease problem has become more serious (Koné, 1999; Kouamé, 2006). Yield losses increased from an average of 10% to 35-40% (Kébé *et al*., 1996). Thus, control of the disease has also become a priority.

Although chemical control was developed by research scientists, the dissemination of this method to farmers had little success. The low level of adoption of this technology by farmers could be explained by the high cost of the fungicides as well as the difficulties related to the provision of water and the application of the fungicides. In addition, the requirements of the international market in terms of bean quality, environmental constraints, health issues for consumers, and the different moratoriums in this area from the market partners (Anonyme, 2006) are among the constraints that do not facilitate the development of the chemical control method.

The strategy adopted in Côte d'Ivoire to control black pod disease is based on integrated management, which is cost effective and environmentally friendly. This approach combines the use of agronomic practices, resistant cocoa varieties and natural antagonistic microorganisms of *Phytophthora*. Research continues in order to improve agronomic practices and varietal resistance. The use of natural antagonistic microorganisms of

