Data integration for inference about spatial processes: A model-based approach to test and account for data inconsistency

  • Simone Tenan ,

    Contributed equally to this work with: Simone Tenan, Chris Sutherland

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    simone.tenan@muse.it

    Affiliation Vertebrate Zoology Section, MUSE - Museo delle Scienze, Corso del Lavoro e della Scienza 3, 38122 Trento, Italy

  • Paolo Pedrini,

    Roles Funding acquisition, Project administration, Resources

    Affiliation Vertebrate Zoology Section, MUSE - Museo delle Scienze, Corso del Lavoro e della Scienza 3, 38122 Trento, Italy

  • Natalia Bragalanti,

    Roles Data curation, Investigation, Resources

    Affiliations Vertebrate Zoology Section, MUSE - Museo delle Scienze, Corso del Lavoro e della Scienza 3, 38122 Trento, Italy, Provincia Autonoma di Trento, Servizio Foreste e Fauna, Via Trener 3, 38100 Trento, Italy

  • Claudio Groff,

    Roles Funding acquisition, Investigation, Project administration, Resources

    Affiliation Provincia Autonoma di Trento, Servizio Foreste e Fauna, Via Trener 3, 38100 Trento, Italy

  • Chris Sutherland

    Contributed equally to this work with: Simone Tenan, Chris Sutherland

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Environmental Conservation, University of Massachusetts, Amherst, MA, 01003, United States of America

Abstract

Recently developed methods that integrate multiple data sources arising from the same ecological processes have typically utilized structured data from well-defined sampling protocols (e.g., capture-recapture and telemetry). Despite this new methodological focus, the value of opportunistic data for improving inference about spatial ecological processes is unclear and, perhaps more importantly, no procedures are available to formally test whether parameter estimates are consistent across data sources and whether they are suitable for integration. Using data collected on the reintroduced brown bear population in the Italian Alps, a population of conservation importance, we combined data from three sources: traditional spatial capture-recapture data, telemetry data, and opportunistic data. We developed a fully integrated spatial capture-recapture (SCR) model that included a model-based test for data consistency to first compare model estimates using different combinations of data, and then, by acknowledging data-type differences, evaluate parameter consistency. We demonstrate that opportunistic data lend themselves naturally to integration within the SCR framework and highlight the value of opportunistic data for improving inference about space use and population size. This is particularly relevant in studies of rare or elusive species, where the number of spatial encounters is usually small and where additional observations are of high value. In addition, our results highlight the importance of testing and accounting for inconsistencies in spatial information from structured and unstructured data so as to avoid the risk of spurious or averaged estimates of space use and, consequently, of population size. Our work supports the use of a single modeling framework to combine spatially-referenced data while also accounting for parameter consistency.

Introduction

Obtaining precise estimates of population density and space use can lead to a better understanding of the processes governing spatiotemporal ecological dynamics and, in turn, improve wildlife management and conservation practices. The task of estimating ecological state variables is, however, challenging, especially for rare or elusive species such as large carnivores, and requires analytical approaches that account for the fact that not all individuals in a population can be observed [1]. Regardless of methodology, the quality of model-based inference is directly related to data quality, which can be an issue for elusive species, especially when resources for monitoring are limited. This has led to an emphasis on developing methods that integrate multiple data sources [2, 3] and, importantly, to a realization that the vast amounts of data regularly collected outside of formal scientific studies, unstructured or opportunistic data, are a potentially valuable data source [4, 5]. Although the majority of data integration methods have focused on improving estimates of species distribution and temporal population trends, opportunistic data has great potential to improve inferences about spatial ecological processes.

Integrated population models (IPMs: [2, 6, 7]) provide a statistical framework for jointly modeling count data and demographic data, typically resulting in improved inferences about the mechanisms regulating population dynamics. As a result, there has been continued development of more general ‘integrated data models’ that seek to combine any independent data sources that arise from the same ecological process [3]. For example, occupancy and abundance are two directly related ecological state variables, and joint analysis of capture-recapture and occupancy data has been shown to improve estimates of abundance [8, 9], density [10], and even colonization-extinction dynamics and dispersal [11]. A common feature of the majority of studies that use multiple data sources, aside from improving parameter precision, is that each independent data set is collected according to a well-defined sampling protocol, i.e., an integration of structured data sets. The value of integrated models that use unstructured or opportunistic data, such as that collected by many citizen scientists, is yet unclear. For example, van Strien et al. [12] argue that opportunistic data represents an important data source that, if analyzed appropriately, can yield improved inferences about temporal trends in occurrence, while Kamp et al. [13] caution against its use, demonstrating that citizen science data were unable to detect significant species declines. Regardless, with the rapid increase in citizen science initiatives, finding innovative ways to utilize opportunistic data will broaden the scope of ecological enquiry that can be addressed within a single analytical framework [3].

Spatial capture-recapture (SCR) methods [14–16] are now well-established in applied ecology and produce estimates of population density using spatial encounter history data. Using spatial patterns of observations to account for heterogeneity in detection due to differences in trap exposure (i.e. distance between traps and individual home ranges), and treating space as an explicit model component, SCR produces unbiased estimates of density and space use across a range of conditions (e.g. [15, 17–19]). Moreover, SCR has been used to estimate density for elusive species from data collected using a variety of field methodologies including camera traps [20], hair snares [21], and scat surveys [22]. A core component of SCR is an explicit model for space use that relates encounter probability to the distance from an individual’s activity center via the estimation of a spatial scale parameter σ [15]. Estimating σ yields an explicit definition of the effective sampling area, and as a result, absolute density can be directly estimated. It follows that to estimate density well, σ must also be well estimated. As with other statistical methods, the precision of SCR-derived estimates of space use and density depends on sample sizes, specifically, but not solely, the number of unique spatial locations at which individuals are observed (spatial recaptures). Thus, adding spatial information should, in theory, lead to improved inference about space use, and in turn, density. For example, Gopalaswamy et al. [23] increased the number of spatial recaptures by integrating camera trap and scat collection data, which resulted in more precise estimates of density, while Royle et al. [24] and Sollmann et al. [25] demonstrated that space use (σ) and density are estimated with higher precision when telemetry data are used in addition to traditional capture-recapture data (see also Table 1).

Table 1. Summary of contributions that provide an integrated framework for spatially-referenced individual data.

Systematic data are collected under specific study designs: spatial capture-recapture (SCR), telemetry, and counts or binary detections (survey). Parameter shared: ψ, Data Augmentation parameter; σ, scale parameter of the observation model; ϕ, survival probability; α, effect of a landscape covariate on the relative probability of use; δ, individual-level recruitment probability.

https://doi.org/10.1371/journal.pone.0185588.t001

Interestingly, in the context of SCR, telemetry data require no information about sampling effort because observed locations provide representative information only about the spatial scale parameter (σ), and thus any amount of telemetry data are likely to be informative about space use. So, while the inability to quantify observer effort and bias is often cited as a major limitation of data collected by citizen scientists [29], it appears that such opportunistic data lends itself naturally to integration within the SCR modeling framework. Specifically, when opportunistic observations can be made of individually-identifiable animals, i.e., via direct or indirect recognition of naturally marked, collared or tagged individuals or the collection of DNA yielding biological samples such as hair or faeces, the locations of those observations are informative about space use. As a consequence, opportunistic observations have the potential to improve estimates of spatial parameters in SCR, and have the added benefit of potentially increasing the geographic extent of monitoring studies significantly.

Although a joint analysis of SCR, telemetry and opportunistic data is possible, such an approach assumes that data sources with shared parameters are indeed informative of the same process. For example, estimating a single spatial scale parameter, σ, from multiple data sources requires that the ecological process, space use, gives rise to the same spatial distribution of observations, i.e., the parameter is consistent. Despite the increasing use of integrated modeling approaches, this issue has received relatively little attention. Popescu et al. [30] demonstrated that information on space use obtained from camera trap and telemetry data is in general agreement and advocated integrating additional sources of information when investigating space use. There are, however, cases where the processes that give rise to spatial observation patterns differ, particularly when using both direct and indirect observations, or when the data types are associated with specific behaviors, for example territory marking at tree rubs compared to larger scale foraging patterns. Thus, the fact that we can integrate multiple data sources does not necessarily mean that we should, and there is a need to find ways of testing for consistency in wildlife data.

Here we analyze data from a reintroduced brown bear Ursus arctos population in the central Alps, one of the most populated regions to be occupied by brown bears [31, 32]. This is a region where bear-human interactions are highly probable and any perceived threat is considered a key factor in determining the success or failure of the reintroduction [33]. First, we describe the brown bear sampling design and data, the classical SCR model, and how opportunistic data can be formally incorporated into spatial capture-recapture methods. We then demonstrate an application of a model-based test for parameter consistency across data types using Bayesian variable selection.

Materials and methods

This study used data from GPS collared brown bears which were captured and collared with the permission of the Italian Ministry of the Environment.

Study area and population

This study was conducted in 2013 in the Italian Alps, an area characterized by a mosaic of natural and human-modified habitats, with a landscape fragmented by urban areas and roads. Elevation ranges from 65 m to more than 3900 m a.s.l., with submontane, montane and subalpine vegetation covering areas below 2000 m, and human population density concentrated below 1000 m [33]. Between 1999 and 2002, nine bears (three males and six females, 3–6 years old) were released in the area as part of a reintroduction project to establish a self-sustained population [33, 34]. At the time, the original brown bear population consisted of at least three animals, which were assumed to have died without any genetic exchange with the translocated bears and their progeny [35].

Brown bear data

Non-invasive genetic sampling.

Bear hair samples were collected from 99 hair traps and 89 rub trees. Hair traps consisted of a strand of barbed wire wound around trees at c. 50 cm above ground level enclosing an area of c. 25 m² with scent lure placed in the center [36]. They were set from 15 May to 31 July, checked on five occasions, and the number of days between occasions ranged from 3 to 10 days (Fig 1). Rub trees, barbed wire wrapped around trees, were monitored during the same period and were checked twice, first after six days and then after a further four days (Fig 1). All hairs on the hair trap and rub tree barbed wire were collected during each visit, ensuring that only newly deposited hairs were collected in subsequent visits. Because the hair trap and rub tree data were collected according to a specific protocol, we refer to these structured data as traditional SCR data, or simply ‘SCR data’. In addition to the structured data collection, opportunistic hair and feces data were also collected [32, 35, 37, 38]. Following notification by third parties (typically members of the public), opportunistic sampling of hair and feces was carried out by agency personnel at sites where bear damage occurred, e.g. depredation on livestock, beehives and/or crops [38]. We refer to these data as ‘opportunistic data’.

Fig 1. Timeline of data collection.

The diagram shows the period when SCR data were systematically collected (i) using an array of hair traps checked on five occasions of variable length (black blocks), and (ii) from rub trees checked for hairs in two periods (grey blocks). Telemetry data were thinned by randomly selecting one record per day, and opportunistic recovery of biological samples was performed on 23 days.

https://doi.org/10.1371/journal.pone.0185588.g001

Biological samples were genetically analyzed for individual identification. For a detailed description of DNA extraction methods, PCR protocols, protocols for individual identification, and molecular sexing, see [32, 35]. We considered only data belonging to the non-cub part of the population and successfully identified a total of n = 22 individuals (12 females and 10 males). Of the 22 individuals, 19 were detected using hair traps; two males and one female were sampled only on rub trees. During the period of trap deployment, 11 of the 22 individuals (four females and seven males) were detected opportunistically resulting in an additional 30 unique spatial locations (Figs 1 and 2a, S1 Fig).

Fig 2. Spatial distribution of the three types of data available for the brown bear population in the central Alps.

(a) Distance from the point where founders were released (in km) and location of bear captures from systematic sampling with hair traps and rub trees (SCR), telemetry and opportunistic records. (b-c) Location of the records for the two collared individuals from which telemetry information was derived. Grey dots indicate the location of all observed individuals.

https://doi.org/10.1371/journal.pone.0185588.g002

Telemetry.

Two bears, one male and one female, were tracked during the hair trap and rub tree sampling period using Global Positioning System (GPS) collars (Vectronic GPS-GSM collars, Vectronic Aerospace GmbH, Berlin, Germany). GPS collars collected positions at different intervals ranging from 10 min to 1 h. For the analysis we selected one random record per day per individual, giving a total of 143 unique telemetry locations (74 and 69 for the male and female respectively). The collared female was detected at a hair trap but never detected at rub trees or opportunistically (Fig 2b). The collared male was never detected with hair traps, but was detected once at a rub tree and was observed opportunistically five times (Fig 2c).

Data analysis

Spatial capture-recapture data.

Spatial capture-recapture models are hierarchical models [39] that describe distance-dependent encounter probabilities (the observation process), and the spatial distribution of individuals across the landscape (density, the ecological state process). We adopt a Bayesian analysis of the model [21, 40] and assume that individual encounter data, $y_{ijk}$, representing whether or not individual i was detected in trap j in occasion k, are Bernoulli random variables with success probability $p_{ijk}$, i.e., the encounter probability:

$$y_{ijk} \sim \mathrm{Bernoulli}(p_{ijk}) \tag{1}$$

Encounter probabilities in SCR are assumed to decline with distance between a trap (x) and an individual’s activity center (s) according to some decreasing function; here we use the half-normal encounter model and allow for sex-specific variation in the parameters:

$$p_{ijk} = p_{0,ijk} \exp\left(-\frac{d(x_j, s_i)^2}{2\sigma_{sex}^2}\right) \tag{2}$$

where $\sigma_{sex}$ is the sex-specific spatial scale parameter that determines the decrease in encounter probability with distance between trap j and individual i’s activity center ($d(x_j, s_i)$). The parameter $p_{0,ijk}$ is the baseline encounter probability and can itself be modeled as a function of individual- (i), trap- (j) and occasion- (k) specific covariates. Specifically, we modeled the baseline encounter probability as a function of sex, trap type (trap: hair trap or rub tree), and, to account for the different time elapsed between consecutive sample occasions in each trap, time since last check (time), using standard logistic regression:

$$\mathrm{logit}(p_{0,ijk}) = \gamma_0 + \gamma_1\, sex_i + \gamma_2\, trap_j + \gamma_3\, time_{jk} \tag{3}$$

where $\gamma_0$ is the baseline encounter probability (on the logit scale) for females at hair traps, $\gamma_1$ is the difference between male and female detections at hair traps, $\gamma_2$ is the additive effect of trap type (i.e., the difference in detectability between hair traps and rub trees), and $\gamma_3$ measures the change in detectability as time elapses since a trap was checked.
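The observation model in Eqs 1 to 3 can be illustrated with a minimal Python sketch. It assumes the usual half-normal form, in which the baseline probability is multiplied by exp(−d²/(2σ²)), and a logit-linear baseline; the function and argument names (`encounter_prob`, `time_since_check`, and so on) are hypothetical and are not taken from the authors' JAGS code.

```python
import math


def inv_logit(x: float) -> float:
    """Inverse of the logit link used for the baseline encounter probability."""
    return 1.0 / (1.0 + math.exp(-x))


def encounter_prob(trap_xy, center_xy, sigma, gamma, sex, trap_type, time_since_check):
    """Half-normal encounter probability for one individual-trap-occasion.

    gamma is (gamma0, gamma1, gamma2, gamma3); sex and trap_type are 0/1
    indicators; time_since_check is assumed to be already standardized.
    """
    g0, g1, g2, g3 = gamma
    # Baseline encounter probability from the logistic regression.
    p0 = inv_logit(g0 + g1 * sex + g2 * trap_type + g3 * time_since_check)
    # Squared Euclidean distance between trap and activity center.
    d2 = (trap_xy[0] - center_xy[0]) ** 2 + (trap_xy[1] - center_xy[1]) ** 2
    # Encounter probability decays with distance at scale sigma.
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))


# Illustrative values only: a trap 3 km from the activity center, sigma = 4 km.
p_example = encounter_prob((3.0, 0.0), (0.0, 0.0), 4.0, (-1.0, 0.5, -0.3, 0.2), 1, 0, 0.0)
```

With all regression coefficients set to zero the baseline is 0.5, and a trap located exactly at the activity center is therefore encountered with probability 0.5 per occasion.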

The second component of the SCR model is a point process model that describes the distribution of individual activity centers, $s_i$, within a defined region, $\mathcal{S}$, which should be large enough to contain all plausible activity centers of all observed individuals [15]. We were particularly interested in modeling density as a function of spatially varying covariates, i.e., as an inhomogeneous point process, and so we used a discrete representation of $\mathcal{S}$ defined as the center points of each pixel. In other words, the state-space was defined as a raster, and individual activity centers ($s_i$) were associated with pixel centroids. In our case, where we are considering a population that was established from a single release point, we modeled variation in density as a function of the distance from the point at which the founding population was released between 1999 and 2002 (d.release; see ‘Study area and population’ and [41]). Using a binomial point process model, the per-pixel intensity, $\mu(s)$, is modeled as a log-linear function of ‘d.release’:

$$\log \mu(s_g) = \beta_0 + \beta_1\, d.release(s_g) \tag{4}$$

for pixel g = 1, …, nG, with total number of pixels nG = 506, and the probability that an individual activity center is located in a pixel, $\pi(s)$, is given by:

$$\pi(s_g) = \frac{\mu(s_g)}{\sum_{g'=1}^{nG} \mu(s_{g'})} \tag{5}$$
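As a rough illustration of how per-pixel probabilities follow from a log-linear intensity, the Python sketch below normalizes exp(β0 + β1 · d.release) over pixels. The normalization is assumed to be the usual one for a discrete state-space (intensity divided by its sum over all pixels), and the name `pixel_probs` and the toy distances are hypothetical.

```python
import math


def pixel_probs(d_release, beta1, beta0=0.0):
    """Probability that an activity center falls in each pixel.

    d_release: distance of each pixel centroid from the release point.
    beta0 cancels in the normalization and is kept only for completeness.
    """
    log_mu = [beta0 + beta1 * d for d in d_release]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_mu)
    weights = [math.exp(v - m) for v in log_mu]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: three pixels at 0, 10 and 20 km from the release point.
probs = pixel_probs([0.0, 10.0, 20.0], beta1=-0.1)
```

A negative β1, as hypothesized for this population, concentrates activity centers in pixels close to the release point. Because the intercept cancels, only β1 affects where individuals are placed.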

This is the standard formulation of a Bayesian SCR model with a sex-specific half-normal encounter probability model, an inhomogeneous point process density model, and the estimation of sex-specific total population size N using data augmentation (see Chapters 7 and 10 in [15]). Sex was known for all observed individuals but not for unobserved (augmented) individuals so was modeled as an individual random effect to be estimated: sexi ∼ Bern(ωsex), where ωsex is the population-level sex ratio. We expected detectability to vary between sexes and trap type, and to be positively related to the time since last check. We also expected space use, σ, to vary by sex. Finally, we expected density to decline with distance from the release point.

The spatial encounter histories for the standard SCR analysis were generated for detections from a J = 188-trap array consisting of hair traps and rub trees across K = 5 sampling occasions (Fig 1 and S2 Fig). Data were formatted in a 3-dimensional M × J × K array, YSCR, where n is the number of observed individuals and M − n is the number of augmented ‘all-zero’ encounter histories, a proportion of which are the estimated unobserved individuals. The additional data required to fit the model are: the coordinates of each hair trap and rub tree, a vector of sex determination of each individual, a J × K trap operation matrix which is a binary indicator denoting whether each trap was operational during each sampling occasion, and the J × K matrix of ‘time since last check’ covariates, which were scaled to have zero mean and unit variance (S2 Fig).
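The data augmentation step can be sketched as follows: the n observed encounter histories are padded with M − n all-zero histories so that the array has M rows. This is a minimal illustration using nested lists; the helper name `augment` and the toy dimensions are hypothetical and do not reproduce the authors' data formatting.

```python
def augment(y_obs, M):
    """Pad observed encounter histories (n x J x K) with all-zero rows up to M."""
    n = len(y_obs)
    if M < n:
        raise ValueError("M must be at least the number of observed individuals")
    J = len(y_obs[0])
    K = len(y_obs[0][0])
    # Each augmented individual gets its own J x K block of zeros.
    zeros = [[[0] * K for _ in range(J)] for _ in range(M - n)]
    return list(y_obs) + zeros


# Toy example: n = 2 individuals, J = 2 traps, K = 2 occasions, augmented to M = 5.
y_toy = [[[1, 0], [0, 0]], [[0, 1], [1, 0]]]
y_aug = augment(y_toy, 5)
```

In the fitted model each of the M rows is paired with a latent indicator of whether that individual is a real member of the population, and population size is the sum of those indicators.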

Telemetry and opportunistic data.

Unlike the traditional SCR data described above, both telemetry locations, $I_{tel}$, and opportunistic data, $I_{opp}$, are not restricted to trap locations and therefore provide important additional information about individual movement, i.e., both are direct observations of space use [24]. We combine the individual telemetry and opportunistic locations and refer to them collectively as $I_i$ for individual i = 1, …, n. These additional locations can be modeled using a bivariate normal ‘movement model’ with mean $s_i$ and variance-covariance matrix Σ with variance $\sigma^2$ and zero covariance [26, 27]:

$$I_i \sim \mathrm{BVN}(s_i, \Sigma) \tag{6}$$

where,

$$\Sigma = \begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} \tag{7}$$

The parameters of this model can be related directly to the SCR half-normal encounter probability model (Eq 2) through the shared parameters s and σ, which means that telemetry data, opportunistic data and traditional SCR data can be jointly modeled, each contributing to the estimation of the latent activity centers and the spatial scale parameter σ.
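To make the contribution of unstructured locations concrete, the following Python sketch evaluates the log-likelihood of a set of locations under an isotropic bivariate normal centered on the activity center, and gives the closed-form maximum likelihood estimate of σ when the center is treated as known. Both function names are hypothetical, and in the integrated model the center is latent and estimated jointly rather than fixed.

```python
import math


def bvn_loglik(locations, center, sigma):
    """Log-likelihood of (x, y) locations under BVN(center, sigma^2 * I)."""
    cx, cy = center
    ll = 0.0
    for x, y in locations:
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        # Isotropic bivariate normal density on the log scale.
        ll += -math.log(2.0 * math.pi * sigma ** 2) - d2 / (2.0 * sigma ** 2)
    return ll


def sigma_mle(locations, center):
    """MLE of sigma for a known center: sqrt(sum of squared distances / (2R))."""
    cx, cy = center
    ss = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in locations)
    return math.sqrt(ss / (2.0 * len(locations)))


# Toy example with three locations around a center at the origin.
toy_locs = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]
toy_sigma = sigma_mle(toy_locs, (0.0, 0.0))
```

The estimate depends only on squared distances from the activity center, which is why locations carry information about σ without any record of sampling effort.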

The telemetry and opportunistic data, for Rtel = 143 and Ropp = 30 locations at which data were available, were formatted in two R × n × K arrays each, one containing x-coordinates and the other containing corresponding y-coordinates, for the n observed individuals and K occasions. This array structure allows the unstructured data (telemetry and opportunistic) to be related to the SCR data in the integrated model (S2 Fig). In addition, an M × K matrix denotes the number of unstructured locations for each individual in each occasion (S2 Fig).

In our case study telemetry data were available for two individuals only. We note, however, that the lack of this type of data is quite common in ecological and wildlife management studies (e.g. 3 collared individuals in [24] and [27]).

Model-based test for data consistency.

First we evaluated the relative value of adding additional data sources by comparing estimates of density and space use obtained from the traditional SCR model with estimates obtained with the addition of telemetry data only, opportunistic sightings data only, and then both telemetry and opportunistic sightings data. Note that these models all assume consistency in the estimation of the spatial scale parameter σsex between structured (SCR) and unstructured (telemetry and opportunistic) data. To test the consistency assumption, we fit a fifth model to formally evaluate whether estimates of σ vary by data type. In summary, we fit the following five models:

  1. Traditional SCR data only (data from hair traps and tree rubs)
  2. SCR data + telemetry data (assuming data consistency)
  3. SCR data + opportunistic sightings (assuming data consistency)
  4. SCR data + telemetry + opportunistic sightings (assuming data consistency)
  5. SCR data + telemetry + opportunistic sightings (data type-specific σtype).

Models 1 to 4 are described above. Model 5 uses a Gibbs variable selection approach (GVS: [42, 43]) to estimate the degree of support for the inclusion of a data type effect on σ, i.e., support for the hypothesis that estimates differ by data type. To do so, we formulate the model for σ as follows:

$$\log(\sigma_{sex,type}) = \theta_\sigma + \theta_1\, sex_i + \theta_2\, type \tag{8}$$

where $\theta_\sigma$ is the intercept of the log-linear model for σ, and because $sex_i$ = 0 for female and type = 0 for traditional SCR data, $\exp(\theta_\sigma)$ is the female σ for structured data ($\sigma_{st,f}$). The parameters $\theta_1$ and $\theta_2$ are the sex and data-type effects (differences) on σ, where estimates not overlapping 0 denote important effects. An important note is that the models that assume data consistency (models 2, 3, and 4) are special cases of this model but with $\theta_2$ fixed at 0. In model 5, the model where the difference between data types is examined, $\theta_2$ is an estimated parameter and is multiplied by an ‘inclusion parameter’ w, a latent binary variable with a Bernoulli prior distribution with probability 0.5. The posterior mean of the inclusion parameter is the probability that there is a difference between types and therefore provides explicit inference about data-specific parameter consistency, and $\theta_2 | w = 1$ is an estimate of the data type effect on σ.
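The role of the inclusion parameter can be sketched in Python as follows. The sketch assumes the log-linear form described above, with the data-type coefficient multiplied by the binary indicator w, and summarizes a vector of posterior draws of w by its mean. The function names and the draws in the example are hypothetical; the sketch is not the Gibbs variable selection sampler itself.

```python
import math


def sigma_loglinear(theta_sigma, theta1, theta2, sex, data_type, w=1):
    """Spatial scale for a given sex (0 female, 1 male) and data type
    (0 structured, 1 unstructured); w = 0 switches the data-type effect off."""
    return math.exp(theta_sigma + theta1 * sex + w * theta2 * data_type)


def inclusion_probability(w_draws):
    """Posterior inclusion probability, Pr(w = 1), as the mean of 0/1 draws."""
    return sum(w_draws) / len(w_draws)


# Illustrative values: theta2 = 0.389 is the reported data-type effect;
# the intercept, the sex effect and the w draws are made up.
sigma_st = sigma_loglinear(1.5, 0.4, 0.389, sex=0, data_type=0)
sigma_un = sigma_loglinear(1.5, 0.4, 0.389, sex=0, data_type=1)
ratio = sigma_un / sigma_st
pr_w = inclusion_probability([1, 1, 0, 1])
```

On the log scale a data-type effect of θ2 multiplies σ by exp(θ2) and, under a circular home range proportional to σ², multiplies home range area by exp(2θ2). For θ2 = 0.389 these factors are about 1.48 and 2.18.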

For each model, we estimated sex-specific total population size, Nsex, and sex-specific (and for model 5, data type-specific) spatial scale parameters, σ, and compared point estimates (posterior median) and precision (95% Bayesian Credible Interval width, BCI from here) of the estimated parameters. We adopted a Bayesian analysis of the SCR models via Markov chain Monte Carlo (MCMC) in the program JAGS [44], run from R [45]. In all models, we used uninformative Normal(0, 100) priors for the parameters γ and β in Eqs 3 and 4. We used a Normal(0, 15) prior for the intercept of the log-linear model for sex- and data-specific σ, and Uniform(-3, 3) priors for the sex and data type regression coefficients θ1 and θ2, respectively. In the model with variable selection to evaluate parameter consistency, a ‘slab and spike’ prior was used to improve the mixing and convergence time of the MCMC algorithm [42] (see S1 Text for details).

After testing a range of resolution values for the state-space, $\mathcal{S}$, we used a resolution of 4 × 4 km, a value that was small enough to yield stable parameter estimates, and large enough to ensure the model was computationally tractable. To ensure the state-space was large enough to contain all plausible activity centers, we used a 21 km buffer around the most extreme coordinates of all the data (telemetry, opportunistic and trapping data, Fig 2a). Data were augmented with M − n ‘all-zero’ encounter histories, where M = 300. Summaries of the posterior distribution were calculated from 30,000 post-burn-in posterior samples (burn-in = 3,000 iterations). The $\hat{R}$ diagnostics [46] used to assess convergence were < 1.02 for all parameters. The code for the fully-integrated models where (i) data consistency is assumed, (ii) data consistency is tested, or (iii) data inconsistency is accounted for, is available in S2 Text.
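A discrete state-space of this kind can be built by buffering the bounding box of all locations and laying a regular grid of pixel centroids over it. The Python sketch below does this for a rectangular region with coordinates in km. It is a simplified illustration: the function name is hypothetical, and the sketch does not attempt to reproduce the 506 pixels used in the analysis.

```python
import math


def state_space_centroids(points, buffer_km=21.0, res_km=4.0):
    """Centroids of a regular grid covering the buffered bounding box of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, xmax = min(xs) - buffer_km, max(xs) + buffer_km
    ymin, ymax = min(ys) - buffer_km, max(ys) + buffer_km
    # Number of pixels needed to cover the extent in each direction.
    nx = math.ceil((xmax - xmin) / res_km)
    ny = math.ceil((ymax - ymin) / res_km)
    return [
        (xmin + (i + 0.5) * res_km, ymin + (j + 0.5) * res_km)
        for j in range(ny)
        for i in range(nx)
    ]


# Toy example: two locations 10 km apart in x and 6 km apart in y.
grid = state_space_centroids([(0.0, 0.0), (10.0, 6.0)])
```

Finer resolutions increase the number of pixels, and therefore computation, roughly with the inverse square of the pixel width, which is the trade-off described above.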

Results

Estimates of the parameter relating density to distance to the reintroduction point (β1) were negative under all models, and although there was some variation in the strength of the effect, this result supports the hypothesis that density decreased with distance from the point where founders were released (S1 Table). The estimated sex ratio in the population, ωsex, did not vary significantly between the five models based on 95% Bayesian Credible Intervals and was lowest in the SCR-only model (0.22, BCI: 0.09–0.41) and highest in the SCR + telemetry model (0.36, BCI: 0.19–0.57). Across all models, detectability was higher for males, higher at hair traps, and increased with increasing time between checks (S1 Table).

Assuming data consistency in the model

Overall, when consistency was assumed, integrating all available sources of information (traditional SCR, telemetry and opportunistic data) produced more precise estimates of population size and spatial scale parameters when compared to models using either SCR data alone or integrating a single additional data source (Fig 3, Tables 2 and 3). In particular, the gain in precision achieved by jointly modeling all three data types was particularly relevant for sex-specific population size estimates (Nsex). In addition to the increase in precision, integrating additional sources of information resulted in a shift in the median abundance point estimates: from 16 (BCI: 8–30) under the SCR-only model to 13 (BCI: 7–22) under the fully integrated model for males (a 19% difference), and from 58 (BCI: 24–148) to 25 (BCI: 13–52) for females (a 57% difference).

Fig 3. Comparison of posterior estimates for population size (N) and spatial scale parameters of the Gaussian kernel (σ) from the different models.

Mean and 95% Bayesian Credible Interval achieved using structured (SCR) data only, or integrating them with unstructured data, i.e. telemetry (‘tel’) and opportunistic (‘opp’) data, available for the brown bear population in the Italian Alps. Filled points correspond to models with implied data consistency; empty points refer to the fully integrated model that accounts for data inconsistency.

https://doi.org/10.1371/journal.pone.0185588.g003

Table 2. Posterior estimates of population size achieved using structured (SCR) data alone or integrated with unstructured (telemetry and opportunistic; tel, opp) information available for the brown bear population in the Italian Alps.

Parameter estimates are reported for males (Nm) and females (Nf).

https://doi.org/10.1371/journal.pone.0185588.t002

Table 3. Posterior estimates of spatial scale parameter achieved using structured (SCR) data alone or integrated with unstructured (telemetry and opportunistic; tel, opp) information available for the brown bear population in the Italian Alps.

Parameters are denoted as follows (m = male, f = female): sex-specific spatial scale parameter shared among different data types, σsex; sex-specific spatial scale parameter for structured and unstructured data, σst,sex and σun,sex, respectively.

https://doi.org/10.1371/journal.pone.0185588.t003

Precision gains in estimates of σ were minimal when adding opportunistic data (fewest additional data points) and highest when integrating telemetry information only (most additional data points), and were higher for males than females (Fig 3, Table 3). As with the estimates of bear population size, the integration of additional information led to a change in the point estimates of σ; compared to the SCR-only model, there was a noticeable increase in the scale of space use when any of the additional data were used. The female 95% home range size estimated from the half-normal encounter model under the SCR-only model was 218 km² (BCI: 114–421) whereas for the fully integrated model, the estimate was 1379 km² (BCI: 1107–1719). Conversely, estimates of male space use, σm, were consistent across models (Fig 3, Table 3), as were the corresponding 95% home range size estimates: 1836 km² (BCI: 1127–3062) under the SCR-only model and 2029 km² (BCI: 1697–2424) under the fully integrated model.
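Home range sizes such as those above are commonly derived from σ by taking the circle that contains 95% of a bivariate normal utilization distribution, whose area is π σ² q with q = −2 ln(0.05) ≈ 5.99. The Python sketch below assumes that this standard conversion applies here, which the text does not state explicitly; the function names are hypothetical.

```python
import math


def home_range_area(sigma_km, level=0.95):
    """Area (km^2) of the circle holding `level` of an isotropic bivariate normal."""
    # Quantile of the chi-square distribution with 2 degrees of freedom.
    q = -2.0 * math.log(1.0 - level)
    return math.pi * q * sigma_km ** 2


def sigma_from_area(area_km2, level=0.95):
    """Inverse conversion: the sigma (km) implied by a home range area."""
    q = -2.0 * math.log(1.0 - level)
    return math.sqrt(area_km2 / (math.pi * q))


# Back-transform two reported female home range sizes to the sigma scale.
sigma_scr_only = sigma_from_area(218.0)
sigma_integrated = sigma_from_area(1379.0)
```

Because area scales with σ², the roughly sixfold difference between the two female home range estimates corresponds, under this conversion, to a difference in σ of about 2.5 times.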

Testing and accounting for data inconsistency

There was strong support for the need to account for differences in the spatial scale parameter as estimated from structured vs. unstructured data (model 5); the posterior inclusion probability of the data-type effect was Pr(w = 1) = 0.98. Population size estimates were similar to those obtained with the fully integrated model where data consistency was assumed (14 males, BCI: 10–25; 28 females, BCI: 16–62; Fig 3, Table 2). The inclusion of a sex-specific data type effect on the spatial scale parameter supported the inconsistency of information on space use provided by structured and unstructured data, with significantly larger σ values, and thus larger home range size estimates, derived from unstructured data than those obtained from structured data. Specifically, the difference between data type-specific spatial scale parameters was different from zero (θ2 = 0.389, BCI: 0.187–0.570; Fig 3, S1 Table). Precision in data type-specific σ estimates was similar in both fully integrated models. Sex-specific 95% home range size estimates derived from structured and unstructured data were 1103 km² (BCI: 812–1521) and 2385 km² (BCI: 1928–2940) for males, and 720 km² (BCI: 474–1090) and 1542 km² (BCI: 1234–1924) for females, respectively.

Discussion

We developed a formulation of a spatial capture-recapture model that integrates multiple data sources and, importantly, allows formal testing and accounting for parameter consistency among data types which, as a result, improves inferences about density and space use. We illustrated the approach using empirical data from a reintroduced population of brown bears in the Italian Alps, a low density species of conservation concern for which available data are sparse. Specifically, we jointly analyzed traditional SCR data, collected using systematic sampling of hair traps and rub trees, satellite telemetry data, and opportunistic recovery of biological samples. Comparing estimates from models ranging from traditional SCR to a fully integrated SCR model, we demonstrated that the addition of unstructured data results in increased precision in estimates of population size and space use. Most importantly, we identified and explicitly modeled data-type-specific inconsistency in estimates of spatial model parameters which, if ignored, can produce biased estimates of space use and abundance. Our approach provides a model-based procedure for valid integration of different sources of information in ecological studies of spatial processes.

Integrating spatial data

Estimates of male population size were stable across all models, with highest precision observed for the fully integrated model, i.e., the model with most spatial data, and lowest for the SCR-only model, which had the fewest spatial locations (Fig 3). For females, estimates of population size from models with additional (unstructured) data were comparable, but were different from the SCR-only estimates, both in terms of precision and point estimates. The addition of telemetry and opportunistic data increased the precision in estimates of female population size when compared to the SCR-only model, but overall, precision was lower than for males.

Although the number of individuals observed for the two sexes was similar (12 females and 10 males), there were fewer spatial recaptures at hair traps or rub trees of females (mean 2.1, min 1, max 4) than of males (mean 8.6, min 1, max 21) (S1 Fig). This suggests that smaller SCR data sets benefit most from an approach that integrates high quality spatial information from even a small number of telemetered or opportunistically observed individuals or, conversely, that they may be less stable and more sensitive to combining data. It is encouraging, however, that the two independent unstructured data sources produce similar estimates of abundance, suggesting the addition of opportunistic information is beneficial when consistent with information on space use provided by other unstructured data, such as telemetry. It is important to note, however, that while we provide a much needed framework for evaluating parameter consistency across data types, our approach does not assess parameter sensitivity. The test for consistency is, in our opinion, a welcome first step towards better understanding the consequences of combining multiple data sources [47], and is relatively straightforward to implement in practice. Evaluating parameter sensitivity, on the other hand, would require a full-scale simulation, especially in the current case where the different data types represent the extremes of a spectrum: SCR sampling typically generates sparse spatial information on several-to-many individuals, while telemetry studies typically generate large amounts of fine scale spatial data on relatively few individuals.

Unsurprisingly, the addition of high temporal resolution telemetry data results in a larger improvement in precision than the addition of the relatively sparse opportunistic data (Fig 3). However, the addition of even a few opportunistic locations produces an increase in precision relative to inference from the SCR-only model (Fig 3), suggesting that even in the absence of telemetry data, opportunistic data can be important sources of spatial information. Of course, in our application the true values are unknown and accuracy cannot be evaluated, but our findings are consistent with previous population estimates for the same time period [38], and support the notion that the estimated female population size derived from the SCR-only model is indeed unrealistically high.

As with abundance, estimates of sex-specific σ were more precise under the integrated model (Fig 3). The most notable difference was between estimates of female space use from the three integrated models and the SCR-only model. These differences arise from the larger spatial distribution of telemetry locations compared to hair trap and rub tree (SCR) data, and they are likely related to the link between detector type and behavior. The SCR data comprised a few spatially clustered encounters, which may be the result of hair deposition patterns related to territoriality, potentially at the core of a home range. Telemetry and opportunistic data, on the other hand, are passive location observations that do not require physical deposition of hair, and that could reflect more general and wider ranging space use (Fig 2b). Estimates of abundance are explicitly linked to estimates of space use, and the differences in the spatial scale parameter estimates, which are more pronounced for females, give rise to the resulting differences in abundance but, perhaps more importantly, are indicative of potential inconsistency in data-specific parameter estimates. In general, these findings further illustrate the fact that integrated data models allow sex-specific effects on detectability, and consequently abundance, to be modeled, which can often be limited by insufficient observations of one sex or the other when traditional SCR data are sparse [48, 49].
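The link between σ and abundance can be illustrated with a deliberately simplified calculation. Assuming a single detector, a single occasion, and a half normal encounter model with baseline encounter probability p0 (a far simpler setting than the multi-trap, multi-occasion model fitted here), the effective area sampled by the detector is:

```latex
p(d) = p_0 \exp\!\left( -\frac{d^{2}}{2\sigma^{2}} \right),
\qquad
a(\sigma) = \int_{0}^{2\pi} \!\! \int_{0}^{\infty} p(r)\, r \,\mathrm{d}r \,\mathrm{d}\phi
          = 2\pi\, p_0\, \sigma^{2}
```

For a fixed number of detected individuals, a density estimate of the form (number detected) / a(σ) is therefore inversely proportional to σ². Under this simplification, an underestimated σ inflates estimated density and hence population size, in line with the high SCR-only estimate for females. In the full model p0 is estimated jointly with σ, so the net effect need not follow this proportionality exactly.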

Testing and accounting for data inconsistency

To deal with the apparent sensitivity to the data integration shown by the marked shift in estimates of female population size and spatial scale parameter, we developed a formal model-based test for parameter consistency. Specifically, we used a Bayesian variable selection approach to estimate the degree of support for data-type-specific parameter estimates, and in doing so, estimate the differences in the spatial scale parameter. In practice, this allows for a convenient and formal assessment of the strength of evidence for whether a parameter, σ in this case (although other parameters could be tested where appropriate), should be shared across data types or instead be data type-specific.
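How a posterior inclusion probability is read off the MCMC output can be sketched as follows. This is a minimal illustration, assuming an indicator-variable formulation in which w = 1 switches the data-type effect on, and a prior inclusion probability of 0.5; the prior actually used is described in S1 Text and may differ. The function name and the indicator draws are hypothetical: the draws are synthetic, chosen only to reproduce a proportion of 0.98, and are not the authors' MCMC output.

```python
def inclusion_summary(w_draws, prior_prob=0.5):
    """Summarize MCMC draws of a 0/1 inclusion indicator.

    Returns the posterior inclusion probability and the Bayes factor
    in favor of including the effect, given the prior inclusion probability.
    """
    n_draws = len(w_draws)
    if n_draws == 0:
        raise ValueError("w_draws must contain at least one draw")
    post_prob = sum(w_draws) / n_draws
    prior_odds = prior_prob / (1.0 - prior_prob)
    if post_prob == 1.0:
        # Every draw included the effect; the odds are unbounded.
        return post_prob, float("inf")
    posterior_odds = post_prob / (1.0 - post_prob)
    return post_prob, posterior_odds / prior_odds


# Synthetic indicator draws (assumption, for illustration only).
w_draws = [1] * 980 + [0] * 20
post_prob, bayes_factor = inclusion_summary(w_draws)
print(f"Pr(w = 1 | data) = {post_prob:.2f}, Bayes factor = {bayes_factor:.1f}")
```

Under the assumed prior of 0.5, an inclusion probability of 0.98 corresponds to posterior odds of 49 to 1 in favor of a data type-specific σ. With a different prior the inclusion probability would need to be converted through the prior odds, which the function does explicitly.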

In our case, the fully integrated model with an effect of data type on σ revealed a significant difference in the posterior estimates of the spatial scale parameter, suggesting an underestimation of the home range size using hair trap and rub tree data alone. In fact, home range size estimated from structured data was less than half of that estimated using unstructured information in both sexes. On the other hand, population size estimates did not strongly differ when accounting for data inconsistency in the fully integrated model, although estimates in the latter case were more conservative. These results suggest that careful consideration should be given to the biological interpretation of space use in SCR models because the data are related closely to behavior, e.g., territory marking versus freely roaming. This is likely the case in this study, as suggested by the fact that the direction of change in the spatial scale parameter is the same for both sexes.

We suspect that the inconsistencies observed in this study are due to sampling-behavior interactions, i.e., the fact that in order to be detected at a hair trap or rub tree, animals must exhibit a certain behavior, whereas both telemetry data and opportunistic sightings are observations of movement around a home range. Thus, inconsistencies are expected when sampling methods are geared towards different behaviors. Apparent inconsistency may also arise in the presence of transient individuals with non-stationary home ranges, e.g. [50]. Thus, it is clear that while integrated data models are appealing, it is important to acknowledge the underlying biology that gives rise to apparently similar data generated from different sampling schemes.

Testing and accounting for data inconsistency in the same modeling framework used to combine structured and unstructured spatially-referenced data may therefore avoid the risk of spurious estimates of space use. This has important implications in the definition of the study design, since estimates of the spatial scale parameter (and ultimately home range size) are used to calibrate trap spacing and in turn the extent of the trap array [15].

Conclusion

We provide evidence that integrating SCR, telemetry, and unstructured opportunistic data, by conceptually treating opportunistic records as thinned telemetry data, improves the precision of inference on abundance and space use, which are key population-level parameters for informing conservation decisions about elusive and difficult-to-study species. However, care must be taken to assess potential inconsistencies in spatial information provided by the different data sets, where telemetry is the most informative source on space use but is often available only for a few individuals, whose movement may not be representative of population space use. Understanding how animal density changes in space and how space is used is crucial when addressing practical issues in population management and conservation [51]. To this end, the use of opportunistic information increases the availability of spatially-referenced individual information that can be suitably modeled along with other data within a unified framework, thus reducing the need for additional invasive methods. Although incorporating unstructured data can increase precision, we should always first verify whether the data are consistent with each other, and thus suitable for integration. The model-based approach presented here offers a natural way of formally testing data consistency.

Supporting information

S1 Fig. Location of bear captures from systematic sampling with hair traps and rub trees (SCR), telemetry and opportunistic records.

Grey dots indicate the location of all observed individuals. Bear ID is reported on the top of each plot.

https://doi.org/10.1371/journal.pone.0185588.s001

(PDF)

S2 Fig. Graphical representation of the data involved in the integrated analysis.

Circles represent estimated parameters. Observed data for n individuals, detected during K visits and members of the population of size N, were augmented with Mn all-zero detections (YSCR matrix). SCR data were collected at J sites, consisting of Jh hair traps and Jr rub trees. Data set names in Courier font correspond to the names used in the model code. Coordinates, trap deployment, and (standardized) time since last check data sets for the J = Jh + Jr traps are denoted by SCR.traps, active, and time.elapsed.sc labels, respectively. Raster data contain information on the distance from the point where founders were released for each of the nG pixels (d2core) and were used to model density. Telemetry and opportunistic data were formatted in the same way, with an augmented matrix for number of records available for individual i in occasion k (n.obs.TEL and n.obs.OPP, respectively) and the x (TEL.y_x, OPP.y_x) and y (TEL.y_y, OPP.y_y) coordinates of those records for each individual in each of the Rtel or Ropp locations and occasion k.

https://doi.org/10.1371/journal.pone.0185588.s002

(PDF)

S1 Text. Details on the Bayesian variable selection.

https://doi.org/10.1371/journal.pone.0185588.s003

(PDF)

S2 Text. R and BUGS codes for the fully integrated models.

https://doi.org/10.1371/journal.pone.0185588.s004

(PDF)

S1 Table. Posterior parameter estimates achieved using structured (SCR) data alone or integrated with unstructured (telemetry and opportunistic; tel, opp) information available for the brown bear population in the Italian Alps.

Parameters are denoted as follows: effect of sex on the spatial scale parameter for structured data, θ1; effect of data type on the spatial scale parameter, θ2; baseline encounter probability (p0) intercept, γ0; effect of being male on p0, γ1; effect of trap type on p0, γ2; effect of time since last check on p0, γ3; density intercept, β0; effect of distance from the release point on density, β1; data augmentation parameter, ψ; probability of being a male, ωsex.

https://doi.org/10.1371/journal.pone.0185588.s005

(PDF)

Acknowledgments

We thank Daniel W. Linden and one anonymous referee for their constructive comments that helped improve the manuscript. We would also like to thank the Autonomous Province of Bolzano, the ISPRA, the personnel of the Servizio Foreste e Fauna of the Autonomous Province of Trento, of the Adamello Brenta Natural Park, and of the Stelvio National Park. We also thank the many forestry wardens and volunteers for field support, Aaron Iemma for IT assistance, and Ben Augustine for discussions about some of the technical details.

References

  1. Williams BK, Nichols JD, Conroy MJ. Analysis and management of animal populations. San Diego: Academic Press; 2002.
  2. Schaub M, Abadi F. Integrated population models: a novel analysis framework for deeper insights into population dynamics. Journal of Ornithology. 2010;152:227–237.
  3. Gimenez O, Buckland ST, Morgan BJ, Bez N, Bertrand S, Choquet R, et al. Statistical ecology comes of age. Biology letters. 2014;10(12):20140698. pmid:25540151
  4. Dickinson JL, Shirk J, Bonter D, Bonney R, Crain RL, Martin J, et al. The current state of citizen science as a tool for ecological research and public engagement. Frontiers in Ecology and the Environment. 2012;10(6):291–297.
  5. Newman G, Wiggins A, Crall A, Graham E, Newman S, Crowston K. The future of citizen science: emerging technologies and shifting paradigms. Frontiers in Ecology and the Environment. 2012;10(6):298–304.
  6. Besbeas P, Freeman SN, Morgan BJT, Catchpole EA. Integrating mark-recapture-recovery and Census Data to Estimate Animal Abundance and Demographic Parameters. Biometrics. 2002;58(3):540–547. pmid:12229988
  7. Tenan S, Adrover J, Navarro AM, Sergio F, Tavecchia G. Demographic consequences of poison-related mortality in a threatened bird of prey. PLoS ONE. 2012;7(11):e49187. pmid:23155464
  8. Conroy MJ, Runge JP, Barker RJ, Schofield MR, Fonnesbeck CJ. Efficient estimation of abundance for patchily distributed populations via two-phase, adaptive sampling. Ecology. 2008;89(12):3362–3370. pmid:19137943
  9. Blanc L, Marboutin E, Gatti S, Zimmermann F, Gimenez O. Improving abundance estimation by combining capture-recapture and occupancy data: example with a large carnivore. Journal of Applied Ecology. 2014;51(6):1733–1739.
  10. Chandler RB, Clark JD. Spatially explicit integrated population models. Methods in Ecology and Evolution. 2014;5(12):1351–1360.
  11. Sutherland C, Elston D, Lambin X. A demographic, spatially explicit patch occupancy model of metapopulation dynamics and persistence. Ecology. 2014;95(11):3149–3160.
  12. van Strien AJ, Swaay CA, Termaat T. Opportunistic citizen science data of animal species produce reliable estimates of distribution trends if analysed with occupancy models. Journal of Applied Ecology. 2013;50(6):1450–1458.
  13. Kamp J, Oppel S, Heldbjerg H, Nyegaard T, Donald PF. Unstructured citizen science data fail to detect long-term population declines of common birds in Denmark. Diversity and Distributions. 2016;22(10):1024–1035.
  14. Efford M. Density estimation in live-trapping studies. Oikos. 2004;106(3):598–610.
  15. Royle JA, Chandler RB, Sollmann R, Gardner B. Spatial Capture-Recapture. Waltham, MA: Academic Press; 2014.
  16. Royle JA, Fuller AK, Sutherland C. Unifying population and landscape ecology with spatial capture-recapture. Ecography. 2017;
  17. Borchers DL, Efford M. Spatially explicit maximum likelihood methods for capture-recapture studies. Biometrics. 2008;64(2):377–385. pmid:17970815
  18. Sollmann R, Gardner B, Belant JL. How does spatial study design influence density estimates from spatial capture-recapture models? PLoS ONE. 2012;7(4):e34575. pmid:22539949
  19. Sutherland C, Fuller AK, Royle JA. Modelling non-Euclidean movement and landscape connectivity in highly structured ecological networks. Methods in Ecology and Evolution. 2015;6(2):169–177.
  20. Royle JA, Karanth KU, Gopalaswamy AM, Kumar NS. Bayesian inference in camera trapping studies for a class of spatial capture-recapture models. Ecology. 2009;90(11):3233–3244. pmid:19967878
  21. Gardner B, Royle JA, Wegan MT, Rainbolt RE, Curtis PD. Estimating Black Bear Density Using DNA Data From Hair Snares. The Journal of Wildlife Management. 2010;74(2):318–325.
  22. Fuller AK, Sutherland CS, Royle JA, Hare MP. Estimating population density and connectivity of American mink using spatial capture-recapture. Ecological Applications. 2016;26(4):1125–1135.
  23. Gopalaswamy AM, Royle JA, Delampady M, Nichols JD, Karanth KU, Macdonald DW. Density estimation in tiger populations: combining information for strong inference. Ecology. 2012;93(7):1741–1751. pmid:22919919
  24. Royle JA, Chandler RB, Sun CC, Fuller AK. Integrating resource selection information with spatial capture-recapture. Methods in Ecology and Evolution. 2013;4(6):520–530.
  25. Sollmann R, Tôrres NM, Furtado MM, de Almeida Jácomo AT, Palomares F, Roques S, et al. Combining camera-trapping and noninvasive genetic data in a spatial capture-recapture framework improves density estimates for the jaguar. Biological Conservation. 2013;167:242–247.
  26. Sollmann R, Gardner B, Parsons AW, Stocking JJ, McClintock BT, Simons TR, et al. A spatial mark-resight model augmented with telemetry data. Ecology. 2013;94(3):553–559. pmid:23687880
  27. Sollmann R, Gardner B, Chandler RB, Shindle DB, Onorato DP, Royle JA, et al. Using multiple data sources provides density estimates for endangered Florida panther. Journal of Applied Ecology. 2013;50(4):961–968.
  28. Linden DW, Siren APK, Pekins PJ. Integrating Telemetry Data Into Spatial Capture-Recapture Modifies Inferences On Multi-Scale Resource Selection. bioRxiv. 2017;
  29. Dickinson JL, Zuckerberg B, Bonter DN. Citizen science as an ecological research tool: challenges and benefits. Annual review of ecology, evolution and systematics. 2010;41:149–72.
  30. Popescu VD, Valpine P, Sweitzer RA. Testing the consistency of wildlife data types before combining them: the case of camera traps and telemetry. Ecology and evolution. 2014;4(7):933–943. pmid:24772272
  31. Chapron G, Kaczensky P, Linnell JD, Von Arx M, Huber D, Andrén H, et al. Recovery of large carnivores in Europe’s modern human-dominated landscapes. Science. 2014;346(6216):1517–1519. pmid:25525247
  32. De Barba M, Waits LP, Genovesi P, Randi E, Chirichella R, Cetto E. Comparing opportunistic and systematic sampling methods for non-invasive genetic monitoring of a small translocated brown bear population. Journal of Applied Ecology. 2010;47(1):172–181.
  33. Mustoni A, Carlini E, Chiarenzi B, Chiozzini S, Lattuada E, Dupré E, et al. Planning the Brown Bear Ursus arctos reintroduction in the Adamello Brenta Natural Park. A tool to establish a metapopulation in the Central-Eastern Alps. Hystrix. 2003;14(1–2):3–27.
  34. Dupré E, Genovesi P, Pedrotti L. Studio di fattibilità per la reintroduzione dell’Orso bruno (Ursus arctos) sulle Alpi centrali. Istituto nazionale per la fauna selvatica ‘Alessandro Ghigi’; 2000.
  35. De Barba M, Waits L, Garton E, Genovesi P, Randi E, Mustoni A, et al. The power of genetic monitoring for studying demography, ecology and genetics of a reintroduced brown bear population. Molecular Ecology. 2010;19(18):3938–3951. pmid:20735733
  36. Woods JG, Paetkau D, Lewis D, McLellan BN, Proctor M, Strobeck C. Genetic tagging of free-ranging black and brown bears. Wildlife Society Bulletin. 1999;27:616–627.
  37. Groff C, Bragalanti N, Rizzoli R, Zanghellini P. 2013 Bear report. Forestry and Wildlife Department of the Autonomous Province of Trento; 2014.
  38. Tenan S, Iemma A, Bragalanti N, Pedrini P, Barba M, Randi E, et al. Evaluating mortality rates with a novel integrated framework for non-monogamous species. Conservation Biology. 2016;30(6):1307–1319. pmid:27112366
  39. Royle JA, Dorazio R. Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities. San Diego: Academic Press; 2008.
  40. Royle JA, Young KV. A hierarchical model for spatial capture-recapture data. Ecology. 2008;89(8):2281–2289. pmid:18724738
  41. Murphy SM, Cox JJ, Augustine BC, Hast JT, Guthrie JM, Wright J, et al. Characterizing recolonization by a reintroduced bear population using genetic spatial capture-recapture. The Journal of Wildlife Management. 2016;80(8):1390–1407.
  42. O’Hara RB, Sillanpää MJ. A Review of Bayesian Variable Selection Methods: What, How and Which. Bayesian Analysis. 2009;4:85–118.
  43. Tenan S, O’Hara RB, Hendriks I, Tavecchia G. Bayesian model selection: The steepest mountain to climb. Ecological Modelling. 2014;283:62–69.
  44. Plummer M. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In: Proceedings of the 3rd International Workshop on Distributed Statistical Computing; 2003.
  45. R Core Team. R: A Language and Environment for Statistical Computing; 2012. Available from: http://www.R-project.org/.
  46. Brooks SP, Gelman A. Alternative methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics. 1998;7:434–455.
  47. Besbeas P, Morgan BJT. Variance estimation for integrated population models. Advances in Statistical Analysis. 2017;
  48. Sollmann R, Furtado MM, Gardner B, Hofer H, Jácomo AT, Tôrres NM, et al. Improving density estimates for elusive carnivores: accounting for sex-specific detection and movements using spatial capture-recapture models for jaguars in central Brazil. Biological Conservation. 2011;144(3):1017–1024.
  49. Tobler MW, Powell GV. Estimating jaguar densities with camera traps: problems with current designs and recommendations for future studies. Biological Conservation. 2013;159:109–118.
  50. Royle JA, Fuller AK, Sutherland C. Spatial capture-recapture models allowing Markovian transience or dispersal. Population ecology. 2016;58(1):53–62.
  51. Bischof R, Swenson JE, Yoccoz NG, Mysterud A, Gimenez O. The magnitude and selectivity of natural and multiple anthropogenic mortality causes in hunted brown bears. Journal of Animal Ecology. 2009;78(3):656–665. pmid:19220565