Fly-QMA: Automated analysis of mosaic imaginal discs in Drosophila

Sebastian M. Bernasek; Nicolás Peláez; Richard W. Carthew; Neda Bagheri; Luís A. N. Amaral

doi:10.1371/journal.pcbi.1007406

Abstract

Mosaic analysis provides a means to probe developmental processes in situ by generating loss-of-function mutants within otherwise wildtype tissues. Combining these techniques with quantitative microscopy enables researchers to rigorously compare RNA or protein expression across the resultant clones. However, visual inspection of mosaic tissues remains common in the literature because quantification demands considerable labor and computational expertise. Practitioners must segment cell membranes or cell nuclei from a tissue and annotate the clones before their data are suitable for analysis. Here, we introduce Fly-QMA, a computational framework that automates each of these tasks for confocal microscopy images of Drosophila imaginal discs. The framework includes an unsupervised annotation algorithm that incorporates spatial context to inform the genetic identity of each cell. We use a combination of real and synthetic validation data to survey the performance of the annotation algorithm across a broad range of conditions. By contributing our framework to the open-source software ecosystem, we aim to support the current move toward automated quantitative analysis among developmental biologists.

Author summary

Biologists use mosaic tissues to compare the behavior of genetically distinct cells within an otherwise equivalent context. The ensuing analysis is often limited to qualitative insight. However, it is becoming clear that quantitative models are needed to unravel the complexities of many biological systems. In this manuscript we introduce a computational framework that automates the quantification of mosaic analysis for Drosophila imaginal discs, a common setting for studies of developmental processes. The software extracts quantitative measurements from confocal images of mosaic tissues, rectifies any cross-talk between fluorescent reporters, and identifies clonally-related subpopulations of cells. Together, these functions allow users to rigorously ascribe changes in gene expression to the presence or absence of particular genes. We validate the performance of our framework using both real and synthetic data. We invite interested readers to apply these methods using our freely available software.

Citation: Bernasek SM, Peláez N, Carthew RW, Bagheri N, Amaral LAN (2020) Fly-QMA: Automated analysis of mosaic imaginal discs in Drosophila. PLoS Comput Biol 16(3): e1007406. https://doi.org/10.1371/journal.pcbi.1007406

Editor: Pedro Mendes, University of Connecticut School of Medicine, UNITED STATES

Received: September 14, 2019; Accepted: January 27, 2020; Published: March 3, 2020

Copyright: © 2020 Bernasek et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data underlying the results presented in the study are available in a public data repository hosted by Northwestern University. DOI: https://doi.org/10.21985/N2F207.

Funding: SMB and LANA were supported by the John and Leslie McQuown Gift. RWC was supported by NIH R35GM118144 (https://www.nih.gov). LANA, NB, and RWC were supported by NSF 1764421 (https://www.nsf.gov). LANA, NB, and RWC were supported by Simons Foundation 597491 (https://www.simonsfoundation.org). NP was supported by the HHMI Hanna H. Gray Fellowship (https://www.hhmi.org/programs/hanna-h-gray-fellows-program). In all cases, the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Quantification will be essential as biologists study increasingly complex facets of organismal development [1]. Unfortunately, qualitative analysis remains common because it is often difficult to measure cellular processes in their native context. Modern fluorescent probes and microscopy techniques make such measurements possible [2–4], but the ensuing image analysis demands specialized skills that fall beyond the expertise of most experimentalists. Automated analysis strategies have addressed similar challenges in cytometry [5–7], genomics and transcriptomics [8–11], and other subdisciplines of biology [12, 13]. Image analysis has proven particularly amenable to automation, with several computer vision tools having gained traction among biologists [14–17]. These platforms are popular because they increase productivity, improve the consistency and sensitivity of measurements, and obviate the need for specialized computational proficiency [18–20]. Designing similar tools to help biologists probe and measure developmental processes in vivo will further transform studies of embryogenesis and development into quantitative endeavors.

Developmental biologists study how the expression and function of individual genes coordinate the emergence of adult phenotypes. They often ask how cells respond when a specific gene, RNA, or protein is perturbed during a particular stage of development. Cell response may be characterized by changes in morphology, or by changes in the expression of other genes (Fig 1A). Experimental efforts to answer this question were historically stifled by the difficulty of isolating perturbations to a single developmental context, as the most interesting perturbation targets often confer pleiotropic function across several stages of development and can trigger early embryonic lethality [21–23].

Fig 1. Perturbing gene expression via mitotic recombination.

Experimental framework using mitotic clones to test whether or not regulatory interactions occur between a perturbation target and reporter of interest. Blue and green markers represent the respective genes encoding the perturbation target and the reporter. (A) A perturbation-induced decrease in reporter levels would confirm that regulation occurs. (B) Mitotic recombination generates clonal subpopulations carrying zero, one, or two copies of the gene encoding a perturbation target. Black lines depict a genetic locus. Only genes downstream of the recombination site are subject to recombination. Red markers represent a gene encoding a clonal marker used to identify the resultant clones. Red shading of large oval reflects relative clonal marker fluorescence level.

https://doi.org/10.1371/journal.pcbi.1007406.g001

Mosaic analysis addressed this challenge in Drosophila by limiting perturbations to a subset of cells within the imaginal discs of the larva [24, 25]. The technique yields a heterogeneous tissue comprised of genetically distinct patches of cells that are clonally related. Aside from rare de novo mutations, cells within each clone are genetically identical. Clone formation may be restricted to specific developing organs by using disc-specific gene promoters to drive trans-chromosomal recombination events in the corresponding imaginal discs [26, 27]. The timing of these events determines the number and size of the resultant clones [28]. Perturbations are applied by engineering the dosage of a target gene to differ across clones (Fig 1B), resulting in clones whose cells are either homozygous mutant (−/−), heterozygous wildtype (+/−), or homozygous wildtype (+/+) for the particular gene. Labeling these clones with the presence or absence of fluorescent markers enables direct comparison of cells subject to control or perturbation conditions, while maintaining otherwise equivalent developmental and physiological histories between the two cell populations (Fig 2A). Additional reporters may be used to monitor differences in RNA or protein expression, morphology, or cell fate choice across clones (Fig 2B). Variants of this strategy led to seminal discoveries in both neural patterning [29–31] and morphogenesis [32, 33], and remain popular today [34–36].

Fig 2. Conventional versus quantitative mosaic analysis.

(A,B) Conventional analysis of a mosaic eye imaginal disc. (A) Clones are identified by visual comparison of clonal marker fluorescence among nuclei. (B) Regions labeled homozygous mutant (−/−) or homozygous wildtype (+/+) for the clonal marker are compared with those labeled heterozygous wildtype (+/−) to assess whether reporter expression differs across clones. Fluorescence bleedthrough is arbitrarily diagnosed. (C-H) Quantitative mosaic analysis. Panels depict a magnified view of the region enclosed by red rectangles in panels A and B. (C) Raw confocal image of the nuclear stain, clonal marker, and reporter of interest. (D) Segmentation identifies distinct nuclei. (E) Reporter expression is quantified by averaging the pixel intensities within each segment. Numbers reflect measured values. (F) Measurements may be corrected to mitigate fluorescence bleedthrough. (G) Individual nuclei are labeled homozygous mutant, heterozygous, or homozygous wildtype for the clonal marker. White arrows mark nuclei with ambiguous fluorescence levels. (H) Reporter levels are compared across clones to determine whether the perturbation affects reporter expression. Comparison may exclude clone borders (yellow regions) and focus on a particular region of the image field (black arrows). In the eye imaginal disc, comparison is often limited to a narrow window near the morphogenetic furrow (MF, orange arrow).

https://doi.org/10.1371/journal.pcbi.1007406.g002

Quantitative microscopy techniques are well suited to measuring differences in cell behavior across clones. One reporter (a clonal marker) labels the clones, while others quantitatively report properties of their constituent cells, such as the expression level of a gene product of interest (Fig 2C). The former then defines the stratification under which the latter are compared. We call this strategy Quantitative Mosaic Analysis (QMA) because it replaces subjective visual comparison with a rigorous statistical alternative. Although a few recent studies have deployed this approach [37–40], qualitative visual comparison remains pervasive in the literature.

We suspect the adoption of QMA has been hindered by demand for specialized computational skills or, in their stead, extensive manual labor. Researchers must first draw or detect boundaries around individual nuclei in a procedure known as segmentation (Fig 2D). Averaging the pixel intensities within each boundary then yields a fluorescence intensity measurement for each reporter in each identified nucleus (Fig 2E). The measurements should then be corrected to account for any fluorescence bleedthrough between reporter channels (Fig 2F). Correction often requires single-reporter calibration experiments to quantify any potential crosstalk between different fluorophores, followed by complex calculations to remedy the data [41, 42]. Researchers must then label, or annotate, each identified nucleus as mutant, heterozygous, or homozygous for the clonal marker. Annotation is typically achieved through visual inspection (Fig 2G). Cells carrying zero, one, or two copies of the clonal marker should exhibit low, medium, or high average levels of fluorescence, respectively. However, both measurement and biological noise introduce the possibility that some cells’ measured fluorescence levels may not reliably reflect their genetic identity. Annotation must therefore also consider the spatial context surrounding each nucleus. For instance, a nucleus whose neighbors express high levels of the clonal marker is likely to be homozygous for the clonal marker, even if its individual fluorescence level is comparable to that of heterozygous cells (Fig 2G, white arrows). Spatial context is particularly informative in developing tissues where cell migration is minimal, such as the fly imaginal discs. With many biological replicates containing thousands of cells each, annotation can quickly become insurmountably tedious. The corrected and labeled measurements are then curated for statistical comparison by excluding those on the border of each clone, and limiting their scope to particular regions of the image field (Fig 2H). Combined, all of these tasks ultimately burden researchers and raise the barrier for adoption of QMA.

Automation promises to alleviate this bottleneck, yet the literature bears surprisingly few computational resources designed to support QMA. The ClonalTools plugin for ImageJ deploys an image-based approach to measure macroscopic features of clone morphology, but is limited to binary classification of mutant versus non-mutant tissue and offers no functionality for comparing reporter expression across clones [43]. Alternatively, the MosaicSuite plugin for ImageJ deploys an array of image processing, segmentation, and analysis capabilities to automatically detect spatial interactions between objects found in separate fluorescence channels [44, 45]. While useful in many other settings, neither of these tools supports automated labeling of individual cells or explicit comparison of clones with single-cell resolution. Most modern studies employing a quantitative mosaic analysis instead report using some form of ad hoc semi-automated pipeline built upon ImageJ [37, 39, 40]. We are therefore unaware of any platforms that offer comprehensive support for an automated QMA workflow.

Here, we introduce Fly-QMA, a computational framework for automated QMA of Drosophila imaginal discs. Fly-QMA supports segmentation, bleedthrough correction, and annotation of confocal microscopy data (Fig 2D–2H). We demonstrate each of these functions by applying them to real confocal images of clones in the eye imaginal disc, and find that our automated approach yields results consistent with manual analysis by a human expert. We then generate and use synthetic data to survey the performance of our framework across a broad range of biologically plausible conditions. Fly-QMA is freely available online (see Data and software availability), along with an interactive coding tutorial designed to acquaint users with the core software features by applying them to example data.

Results

Quantification of nuclear fluorescence levels

We implemented a segmentation strategy based upon a standard watershed approach [52]. Briefly, we construct a foreground mask by Otsu thresholding the nuclear stain or nuclear label image following a series of smoothing and contrast-limited adaptive histogram equalization operations [52, 53]. We then apply a Euclidean distance transform to the foreground mask, identify the local maxima, and use them as seeds for watershed segmentation. When applied to the microscopy data, few visible spots in the nuclear stain were neglected, and the vast majority of segments outlined individual nuclei (S1C Fig).

This approach is flexible and should perform adequately in many scenarios. However, we acknowledge that no individual strategy can address all microscopy data because segmentation is strongly context dependent. All subsequent stages of analysis were therefore designed to be compatible with any data that conform to our standardized file structure. This modular arrangement grants users the freedom to use one of the many other available segmentation platforms [54], including FlyEye Silhouette [55], before applying the remaining functionalities of our framework. Regardless of how nuclear contours are identified, averaging the pixel intensities within them yields fluorescence intensity measurements for each reporter in each identified nucleus. We next sought to ensure that these measurements were suitable for comparison across clones.
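The measurement step can be illustrated with a minimal sketch. It assumes the segmentation is available as a label image in which each pixel holds the integer ID of its nucleus (0 for background). The function name and the nested-list image representation are illustrative assumptions and are not taken from the Fly-QMA API.

```python
from collections import defaultdict


def mean_intensity_per_segment(labels, channel):
    """Average pixel intensity of `channel` within each labeled segment.

    labels  : 2D list of ints, 0 = background, k > 0 = nucleus ID (assumed format)
    channel : 2D list of floats with the same shape as `labels`
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label_row, pixel_row in zip(labels, channel):
        for label, pixel in zip(label_row, pixel_row):
            if label == 0:  # skip background pixels
                continue
            totals[label] += pixel
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Toy 3x4 image with two nuclei (IDs 1 and 2).
labels = [[1, 1, 0, 2],
          [1, 0, 0, 2],
          [0, 0, 2, 2]]
rfp = [[10.0, 12.0, 1.0, 40.0],
       [14.0, 1.0, 1.0, 44.0],
       [1.0, 1.0, 36.0, 40.0]]
levels = mean_intensity_per_segment(labels, rfp)
```

Because the function only needs a label image, the same measurement applies whichever segmentation tool produced the contours.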

Bleedthrough correction

Despite efforts to select non-overlapping reporter bandwidths and excite them sequentially, it is not uncommon for reporters excited at one wavelength to emit some fluorescence in the spectrum collected for another channel (Fig 2B, yellow lines) [41, 56]. The end result is a positive correlation, or crosstalk, between the measured fluorescence intensities of two or more reporters. Exogenous correlations between the measured fluorescence intensities of the clonal marker and the reporter of interest are problematic given that the purpose of the experiment is to detect changes in reporter levels with respect to the clonal marker.

In our microscopy data, individual clones were distinguished by their low, medium, or high expression levels of an RFP-tagged clonal marker (Fig 3A). These images should not have shown any detectable difference in GFP levels across clones because all cells carried an equivalent dosage of the control reporter (S1A Fig). However, the images visibly suffered from bleedthrough between the RFP and GFP channels (Fig 3A and 3B). Bleedthrough was similarly evident when we compared measured GFP levels across labeled clones. Nuclei labeled mutant, heterozygous, or homozygous for the clonal marker had low, medium, and high expression levels of the control reporter, respectively (Fig 3C, black boxes). The data were therefore ripe for systematic correction.

Fig 3. Automated correction of fluorescence bleedthrough in the larval eye.

(A) Low, medium, and high expression levels of the RFP-tagged clonal marker. (B) GFP-tagged control reporter expression. RFP fluorescence bleedthrough is visually apparent upon comparison with A. (C) Comparison of control reporter expression between clones. Includes data aggregated across nine images taken from six separate eye discs. Data were limited to cells within the region of elevated GFP expression that were of approximately comparable developmental age (see S2E–S2G Fig). Measurements are stratified by their assigned labels. Before correction, expression differs between clones (black boxes, p < 10⁻⁵). No difference is detected after correction (red boxes, p > 0.05).

https://doi.org/10.1371/journal.pcbi.1007406.g003

Spectral bleedthrough correction is common practice in other forms of cross-correlation and co-localization microscopy [41, 56]. These methods typically entail characterizing the extent of crosstalk between fluorophores globally [57, 58], on a pixel-by-pixel basis [42], or by experimental calibration [41], then detrending all images or measurements prior to subsequent analysis. Our framework adopts the global approach, using the background pixels in each image to infer the extent of fluorescence bleedthrough across spectral channels.

Specifically, we assume the fluorescence intensity F_ij for channel i at pixel j is a superposition of a background intensity B_ij and some function f of the expression level E_ij that we seek to compare across cells [59]:

$$F_{ij} = B_{ij} + f(E_{ij}) \tag{1}$$

We further assume that the background intensity of a channel includes linear contributions from the fluorescence intensity of each of the other channels:

$$B_{ij} = \sum_{k=1}^{K} \alpha_k F_{kj} + \beta \tag{2}$$

where k is indexed over K anticipated sources of bleedthrough. Given estimates for each {α₁, α₂, …α_K} and β we can then estimate the background intensity of each measurement:

$$\langle B_{ij} \rangle = \sum_{k=1}^{K} \alpha_k \langle F_{kj} \rangle + \beta \tag{3}$$

where the angle brackets denote the average across all pixels within a single nucleus. The corrected signal value is obtained by subtracting the background intensity from the measured fluorescence level:

$$\langle f(E_{ij}) \rangle = \langle F_{ij} \rangle - \langle B_{ij} \rangle \tag{4}$$

Repeating this procedure for each nucleus facilitates comparison of relative expression levels across nuclei in the absence of bleedthrough effects. Bleedthrough correction performance is therefore strongly dependent upon accurate estimation of the bleedthrough contribution strengths, {α₁, α₂, …α_K}.

We estimate these parameters by characterizing their impact on background pixels (see Methods). When applied to the microscopy data, bleedthrough correction successfully eliminated any detectable difference in GFP expression across clones (Fig 3C, red boxes, p > 0.05 two-sided Mann-Whitney U test).
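The correction can be sketched for the simplest case of a single bleedthrough source (K = 1). The sketch assumes that the contribution strength α and offset β are fit by ordinary least squares on background pixels. That choice is a stand-in for illustration; the estimation procedure actually used is the one described in the Methods. All function names and pixel values below are hypothetical.

```python
def fit_bleedthrough(source_bg, target_bg):
    """Least-squares fit of target = alpha * source + beta on background pixels."""
    n = len(source_bg)
    mean_x = sum(source_bg) / n
    mean_y = sum(target_bg) / n
    sxx = sum((x - mean_x) ** 2 for x in source_bg)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(source_bg, target_bg))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    return alpha, beta


def correct(target_mean, source_mean, alpha, beta):
    """Subtract the estimated background from a nucleus-averaged measurement."""
    background = alpha * source_mean + beta
    return target_mean - background


# Hypothetical background pixels in which GFP = 0.2 * RFP + 5 exactly.
rfp_bg = [10.0, 20.0, 30.0, 40.0]
gfp_bg = [7.0, 9.0, 11.0, 13.0]
alpha, beta = fit_bleedthrough(rfp_bg, gfp_bg)

# Two nuclei with the same true GFP signal (50) but different RFP levels.
low = correct(target_mean=0.2 * 20 + 5 + 50, source_mean=20.0, alpha=alpha, beta=beta)
high = correct(target_mean=0.2 * 80 + 5 + 50, source_mean=80.0, alpha=alpha, beta=beta)
```

In this toy case the two nuclei differ by 12 units before correction and agree after it, which mirrors the disappearance of the clone-dependent GFP difference reported for the real data.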

Automated annotation of clones

Our annotation strategy seeks to label each identified cell as homozygous mutant, heterozygous wildtype, or homozygous wildtype for the clonal marker. Variation within each clone precludes accurate classification of a cell’s genotype solely on the basis of its individual expression level. However, in tissues where cell migration is minimal, clonal lineages are unlikely to exist in isolation because recombination events are typically timed to generate large clones. Our strategy therefore integrates both clonal marker expression and spatial context to identify clusters of cells with locally homogeneous expression behavior, then maps each cluster to one of the possible labels. This unsupervised approach lends itself to automated annotation because the clusters are inferred directly from the data without any guidance from the user.

We first train a statistical model to estimate the probability that a given measurement came from a cell carrying zero, one, or two copies of the clonal marker (S3A Fig). This entails fitting a weighted mixture of three or more bivariate lognormal distributions (components) to a two dimensional set of observations (S3B and S3C Fig). The first dimension corresponds to the clonal marker fluorescence level measured within each cell. The second dimension describes the local average expression level within the region surrounding each cell. We evaluate the latter by estimating a neighborhood radius from the decay of the radial correlation of the expression levels, then averaging the expression levels of all cells within that radius (S3D Fig). The second dimension therefore measures the spatial context in which a cell resides. We balance model fidelity against overfitting by using the Bayesian information criterion to determine the optimal number of model components (S3E Fig). We then cluster the components into three groups on the basis of their mean values (S3F Fig), effectively mapping each component to one of the three possible gene dosages. The model may be trained using observations derived from a single image, or with a collection of observations derived from multiple images. Once trained, the model is able to predict the conditional probability that an individual observation belongs to one of the model’s components, given its measured expression level.
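The two-dimensional observations can be sketched as follows. Two simplifications are assumptions of this example: the neighborhood radius is passed in as a fixed value rather than estimated from the decay of the radial correlation, and each cell is counted within its own neighborhood. Values are log-transformed so that lognormal components would appear as normal ones. The function names are hypothetical.

```python
import math


def spatial_context(positions, levels, radius):
    """Mean level of all cells within `radius` of each cell (the cell itself included)."""
    context = []
    for xi, yi in positions:
        nearby = [level for (xj, yj), level in zip(positions, levels)
                  if math.hypot(xi - xj, yi - yj) <= radius]
        context.append(sum(nearby) / len(nearby))
    return context


def observations(positions, levels, radius):
    """Pair each cell's log level with the log of its neighborhood mean."""
    context = spatial_context(positions, levels, radius)
    return [(math.log(level), math.log(mean)) for level, mean in zip(levels, context)]


# Four cells on a line; the third is a dim outlier inside a bright patch.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
levels = [8.0, 8.0, 2.0, 8.0]
context = spatial_context(positions, levels, radius=1.0)
pairs = observations(positions, levels, radius=1.0)
```

The dim third cell has a low first coordinate but a second coordinate pulled toward its bright neighbors, which is the information a mixture model fit to these pairs can exploit.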

We then use the learned conditional probabilities to detect entire clones, thus assigning a label to each cell. Rather than using the trained model to classify each observation, we compile a new set of observations by limiting each estimate of spatial context to spatially collocated communities with similar expression behavior (S4A Fig). We identify these communities by applying a community detection algorithm to an undirected graph connecting adjacent cells (S4B Fig). Edges in this graph are weighted by the similarity of clonal marker expression between neighbors, resulting in communities with similar expression levels (S4E Fig, Steps I and II). The graph-based approach increases spatial resolution by limiting the information shared by dissimilar neighbors. Applying the mixture model yields an initial estimate of the probability that an observation belongs to one of the model’s components (S4E Fig, Step III). We further refine these estimates by allowing the probabilities estimated for each cell to diffuse throughout the graph (S4E Fig, Step IV). The rate of diffusion between neighbors is determined by the weight of the edge that connects them, with more similar neighbors exerting stronger influence on each other. We then use the diffused probabilities to identify the most probable source component and label each observation (S4E Fig, Step V). These probabilities also provide a measure of confidence in the assigned labels. We replace any low-confidence labels with alternate labels assigned using a marginal classifier that neglects spatial context (S4F and S4G Fig), resulting in a fully labeled image (S4H Fig).
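The diffusion and labeling steps can be sketched with a simple iterative scheme in which each cell repeatedly blends its own label probabilities with a weighted average of its neighbors'. This is a stand-in chosen for brevity and is not claimed to match the diffusion procedure implemented in Fly-QMA. The mixing rate, step count, edge weights, and function names are all hypothetical.

```python
def diffuse(probabilities, edges, rate=0.5, steps=10):
    """Iteratively blend each cell's label probabilities with those of its neighbors.

    probabilities : dict cell -> list of label probabilities (sums to one)
    edges         : dict (cell_a, cell_b) -> similarity weight (undirected)
    """
    neighbors = {cell: [] for cell in probabilities}
    for (a, b), weight in edges.items():
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    current = {cell: list(p) for cell, p in probabilities.items()}
    for _ in range(steps):
        updated = {}
        for cell, p in current.items():
            total = sum(w for _, w in neighbors[cell])
            if total == 0:  # isolated cell keeps its own estimate
                updated[cell] = list(p)
                continue
            pooled = [sum(w * current[n][k] for n, w in neighbors[cell]) / total
                      for k in range(len(p))]
            updated[cell] = [(1 - rate) * own + rate * nb for own, nb in zip(p, pooled)]
        current = updated
    return current


def assign_labels(probabilities):
    """Pick the most probable label index for each cell."""
    return {cell: max(range(len(p)), key=p.__getitem__) for cell, p in probabilities.items()}


# Cell "c" looks heterozygous on its own but sits among confident homozygous cells.
initial = {"a": [0.0, 0.1, 0.9], "b": [0.0, 0.1, 0.9],
           "c": [0.0, 0.6, 0.4], "d": [0.0, 0.1, 0.9]}
edges = {("a", "c"): 1.0, ("b", "c"): 1.0, ("d", "c"): 1.0, ("a", "b"): 1.0}
before = assign_labels(initial)
after = assign_labels(diffuse(initial, edges))
```

Lowering the weight of an edge between dissimilar neighbors reduces how much they influence each other, which is how a similarity-weighted graph limits information sharing across clone borders.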

The algorithm leverages the collective wisdom of neighboring measurements to override spatially isolated fluctuations in clonal marker expression, and thereby enforces consistent annotation within contiguous regions of the image field. The size of these regions depends upon the granularity of estimates for the spatial context surrounding each cell. We used an unsupervised approach to choose an appropriate spatial resolution in a principled manner. In short, the resolution is matched to the approximate length scale over which expression levels remain correlated among cells. Both the training and application stages of our annotation algorithm use this automated approach (S3D and S4D Figs), thus averting any need for user input.

Manual assessment of annotation performance

We sought to validate the performance of the annotation algorithm by assessing its ability to accurately reproduce human-assigned labels. We manually labeled nuclei in each eye imaginal disc as homozygous mutant, heterozygous wildtype, or homozygous wildtype for the clonal marker, then automatically labeled the same cells (Fig 4A). The two sets of labels showed strong overall agreement (Fig 4B and S5A Fig). Excluding cells on the border of each clone revealed greater than 97% agreement in seven of the nine annotated images (see Table 1). Upon secondary inspection of the sole instance of substantial disagreement (S5B Fig), we are unable to confidently discern which set of labels is more accurate. While manual labeling required more than one hour of labor per image, the annotation algorithm achieved comparable accuracy in a matter of seconds. This performance advantage would continue to grow if the analysis were extended to multiple image layers, tissue samples, and experimental conditions.

Fig 4. Automated unsupervised annotation of clones in the larval eye.

(A) Labels assigned by automated annotation. Yellow, cyan, and magenta denote the label assigned to each contour. Labels are overlaid on the RFP channel of the image shown in S1B Fig. Cells on the periphery of each clone are excluded. (B) Comparison of automated annotation with manually-assigned labels. Confusion matrix includes data aggregated across nine images taken from six separate eye discs. Cells on the periphery of each clone are excluded. Columns sum to one.

https://doi.org/10.1371/journal.pcbi.1007406.g004

Table 1. Automated vs. manual annotation.

https://doi.org/10.1371/journal.pcbi.1007406.t001

While it is common practice to use human-labeled data as the gold standard, manually assigned labels do not represent a reliable and reproducible ground truth. Furthermore, we contend that validation with manually-labeled data entrains implicit human biases in the selection of performant algorithms. These biases are particularly pronounced in biological image data where intrinsic variation, measurement noise, and transient processes can make cell-type annotation a highly subjective, and thus irreproducible, task.

Synthetic benchmarking of annotation performance

Synthetic benchmarking provides a powerful alternative to validation against manually labeled data. The idea is simple: measure how accurately an algorithm is able to label synthetic data for which the labels are known. The synthetic data generation procedure may be modeled after the process underlying formation of the real data, so that the algorithm can be assessed across the range of conditions that it is likely to encounter. The strategy therefore provides a means to survey the breadth of biologically plausible conditions under which the algorithm provides adequate performance. Synthetic benchmarking also facilitates unbiased comparison of competing algorithms, resulting in a reliable standard that may be called upon at any time.

We used synthetic microscopy data to benchmark the performance of our annotation strategy. Each synthetic dataset depicts a simulated culture of cells distributed roughly uniformly in space (S6A Fig). Cells in this culture contain zero, one, or two copies of a gene encoding an RFP-tagged clonal marker (S6B Fig). Our simulation procedure ensures that cells tend to remain proximal to their clonal siblings (S6C Fig), thus forming synthetic clones with tunable size and spatial heterogeneity (S6D and S6E Fig). We generated synthetic measurements by randomly sampling fluorescence levels in a dosage-dependent manner (S7A–S7C Fig). We varied the similarity of fluorescence levels across clones using an ambiguity parameter, σ_α, that modulates the spread of the distributions used to generate fluorescence levels (S7D–S7F Fig).
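The sampling step can be sketched under stated assumptions: fluorescence levels are drawn from lognormal distributions whose log-space centers depend on gene dosage, and the ambiguity parameter sets their log-space spread. The distribution family, the center values, and the function name are hypothetical choices for illustration, not the parameters used to build the benchmark.

```python
import math
import random


def sample_fluorescence(dosages, ambiguity, seed=0):
    """Draw one lognormal fluorescence level per cell, centered on a dosage-specific value.

    dosages   : list of gene copy numbers (0, 1, or 2)
    ambiguity : spread of the underlying normal distribution in log space
    """
    # Hypothetical log-space centers for zero, one, and two copies of the marker.
    log_center = {0: math.log(0.25), 1: math.log(0.5), 2: math.log(1.0)}
    rng = random.Random(seed)
    return [rng.lognormvariate(log_center[d], ambiguity) for d in dosages]


dosages = [0, 0, 1, 1, 2, 2]
crisp = sample_fluorescence(dosages, ambiguity=0.0)  # dosages perfectly separated
noisy = sample_fluorescence(dosages, ambiguity=0.5)  # distributions overlap
```

With zero ambiguity every cell sits exactly at its dosage-specific level. Increasing the ambiguity widens each distribution until neighboring dosages overlap and a single measurement no longer identifies the genotype.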

Using this schema as a template, we generated a large synthetic dataset, annotated each set of measurements, and compared the assigned labels with their true values. We used the mean absolute error as a comparison metric because it provides a stable measure of accuracy for multiclass classification problems in which the labels are intrinsically ordered [60]. In other words, it penalizes egregious misclassifications more severely than mild ones.
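A short sketch shows how the mean absolute error separates mild from egregious mistakes when labels are encoded by gene dosage (0, 1, or 2). The integer encoding and function names are assumptions of this example.

```python
def mean_absolute_error(true_labels, assigned_labels):
    """Mean absolute difference between ordered integer labels (0, 1, 2)."""
    pairs = list(zip(true_labels, assigned_labels))
    return sum(abs(t - a) for t, a in pairs) / len(pairs)


def accuracy(true_labels, assigned_labels):
    """Fraction of labels that match exactly."""
    pairs = list(zip(true_labels, assigned_labels))
    return sum(t == a for t, a in pairs) / len(pairs)


truth = [0, 1, 2, 2]
mild = [0, 1, 1, 2]    # one homozygous cell called heterozygous
severe = [0, 1, 0, 2]  # one homozygous cell called mutant

mae_mild = mean_absolute_error(truth, mild)
mae_severe = mean_absolute_error(truth, severe)
```

Both annotations contain exactly one error and so share the same accuracy, but the mean absolute error of the second is twice that of the first because the assigned label is two dosage steps away from the truth.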

Annotation performance is very strong for all cases in which σ_α ≤ 0.3 (Fig 5). Unsurprisingly, performance suffers as the difficulty of the classification problem is increased. The same trends are evident when performance is graded strictly on accuracy (S8 Fig). As cells on the periphery of each clone were not excluded from these analyses, the observed metrics provide a lower bound on the performance that may be anticipated in practice.

Fig 5. Synthetic benchmarking of automated annotation performance.

Each pixel reflects the mean MAE across 50 replicates. Clone size reflects the mean number of cells per clone. Performance improves with increasing clone size and worsens with increasing fluorescence ambiguity.

https://doi.org/10.1371/journal.pcbi.1007406.g005

Performance improved with increasing clone size. We suspected this was caused by larger clones offering additional spatial context to inform the identity of each cell. We verified our assertion by re-evaluating performance relative to a variant of our annotation algorithm that neglects spatial context (S4G Fig). As expected, the variant’s performance exhibited no dependence on clone size (S9A Fig). Comparing the two strategies confirmed that spatial context confers the most benefit when clones are large (S9B Fig). Inclusion of spatial context also becomes increasingly advantageous as the fluorescence ambiguity is increased, even for smaller clones. Thus, spatial context adds progressively more value as the classification task becomes more difficult.

This observation may be rationalized from a statistical perspective. Each cell is classified by maximizing the probability that the assigned label is correct. We compute these probabilities using the estimated expression level of each cell. Neglecting spatial context, this estimate is limited to a single sample and is therefore highly sensitive to both measurement and biological noise. Incorporating spatial context expands the sample size and thereby reduces the standard error of the estimated fluorescence level. The strategy is thus generally well suited to scenarios in which fluorescence intensities correlate across large clones, and closely parallels computer vision methods that exploit spatial contiguity to segment image features with ill-defined borders [61]. Because increased measurement precision comes at the expense of spatial resolution, we expect strong performance when measurements are aggregated across relatively large clones, but failure to detect small, heterogeneous clones. These expectations are consistent with the observed results. They are also conveniently aligned with the anticipated properties of real data, as experiments typically attempt to mitigate edge effects by driving early recombination events to generate large clones.
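The sample-size argument can be made concrete under a simplifying assumption that is introduced here for illustration: the n cells contributing to a neighborhood estimate share a common mean and have independent fluctuations with standard deviation σ. The standard error of their average is then

```latex
\mathrm{SE}(\bar{x}_n) = \frac{\sigma}{\sqrt{n}},
\qquad
\frac{\mathrm{SE}(\bar{x}_{25})}{\mathrm{SE}(\bar{x}_{1})} = \frac{1}{\sqrt{25}} = \frac{1}{5}
```

so pooling 25 cells would cut the uncertainty of the estimated fluorescence level fivefold relative to a single cell. Real neighbors are neither fully independent nor guaranteed to belong to the same clone, so the practical gain is likely smaller, and it vanishes when the neighborhood is larger than the clone itself.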

Discussion

We used synthetic data to survey the performance of our annotation strategy across a much broader range of conditions than would have otherwise been possible with manually labeled data. This included conditions well beyond those of practical use. In particular, experiments designed to compare gene expression levels across clones would likely seek to avoid generating small clones with ambiguous clonal marker expression. Beyond complicating the annotation task, small clones are also exposed to diffusion-mediated signals from adjacent clones that can mask the effect of mutations. Cells located near the clone boundaries are often excluded for the same reason, as quantification is typically most reliable in cells surrounded by similar neighbors. Synthetic data provided a means to survey these edge cases and establish a lower bound on annotation performance. The strong performance observed across the remaining conditions bolsters our confidence that our annotation strategy is well suited to the images it is likely to encounter.

In each of our examples, clones were distinguished by ternary segregation of nuclear clonal marker fluorescence levels. Modern mosaic analysis techniques continue to deploy ternary labeling [62, 63], but also frequently opt for binary labeling of mutant versus non-mutant clones [64–66] and dichromic labeling of twin-spots [67, 68]. Our annotation scheme readily adapts to each of these scenarios provided that the number of anticipated labels is adjusted accordingly. In the case of dichromic labeling, binary classification would be performed separately for each color channel before merging the assigned labels. Extending the same logic to combinatorial pairs of colors suggests that our framework may also be compatible with multicolor labeling schemes used to simultaneously trace many clonal lineages over time [69–71]. A notable limitation of our approach is its reliance upon reporter fluorescence levels within distinct cells or nuclei. This requirement for discrete measurements precludes analysis of contiguous clones in which cytoplasmic fluorescence signals are indistinguishable between adjacent cells. Our framework is thus well suited to many different mosaic analysis platforms deployed in imaginal discs, so long as reporter fluorescence levels are measured on a discrete basis.

In principle, the framework described here should also be applicable to a wide variety of other tissues [72, 73] and model organisms [74–76] in which mosaics are studied. In practice, application to alternate contexts would require modifying some stages of the analysis. Most notably, image segmentation is strongly context dependent and any attempts to develop a universally successful strategy are likely to prove futile [77]. For this reason, we implemented a modular design in which each stage of analysis may be applied separately. For example, a user could perform their own segmentation before using our bleedthrough correction and clone annotation tools. By offering modular functionalities we hope to extend the utility of our software to the wider community of developmental biologists. Furthermore, the open-source nature of our framework supports continued development of more advanced features as various demands arise. Our synthetic benchmarking platform could then be used to objectively confirm the benefit conferred by any future developments.

Materials and methods

Genetics and microscopy of Drosophila eye imaginal discs

We borrowed an experimental dataset from a separate study of neuronal fate commitment during eye disc development [38]. The data consist of six eye imaginal discs dissected and fixed during the third larval instar of Drosophila development. Within each disc, ey>FLP and FRT40A were used to generate clones. The chromosome arm (2L) targeted for recombination was marked with a Ubi-mRFPnls transgene (S1A Fig), enabling automated detection of clones marked by distinct levels of mRFP fluorescence (S1B Fig). The discs also carried a pnt-GFP reporter transgene located on a different chromosome that was not subject to mitotic recombination. The Pnt-GFP reporter is predominantly expressed in two narrow stripes of progenitor cells during eye disc development [38]. The first stripe occurs immediately posterior to a wave of developmental signaling that traverses the eye disc. Progenitor cells located in this region are suitable for comparison because they are of approximately equivalent developmental age. We applied the Fly-QMA framework to a total of nine images of these cells.

Genetics, fly lines, immunohistochemistry, and imaging conditions related to this dataset have already been published [38]. All discs were dissected in PBS, fixed in 4% paraformaldehyde for 30 min at room temperature, and permeabilized with PBS-Triton X-100 0.1% for 20 min at room temperature to allow DAPI penetration without perturbing the fluorescence of the Pnt-GFP protein. Discs were subsequently stained with a 4’,6-diamidino-2-phenylindole (DAPI) nuclear marker, rinsed twice with PBS-Tween 0.5%, and mounted on Vectashield (Vector Labs). Images were acquired using a Leica SP5 confocal equipped with a tunable detector. The 405, 488, and 561 nm lasers were used to excite DAPI, Pnt-GFP, and Ubi-mRFPnls, while photons were collected in the 437–481, 491–555, and 570–644 nm intervals for DAPI, GFP, and mRFP, respectively. Images were recorded with 16-bit resolution using a 40X oil objective. Discs were oriented with the dorso-ventral equator parallel to the horizontal axis, and all images captured at least six rows of ommatidia on either side of the equator. All discs were fixed, mounted, and imaged in parallel in order to reduce measurement error.

Characterization of fluorescence bleedthrough

For each image, we morphologically dilate the foreground until no features remain visible (S2A Fig). We then extract the background pixels and resample them such that the distribution of pixel intensities is approximately uniform (S2B Fig). Resampling helps mitigate the skewed distribution of pixel intensities found in the background. We then estimate values for each {α₁, α₂, …α_K} and β by fitting a generalized linear model to the fluorescence intensities of the resampled pixels (S2C Fig). Each model is a variant of Eq 3 in which angled braces instead denote averages across all background pixels. We formulate these models with identity link functions under the assumption that residuals are gamma distributed. Their coefficients provide an estimate of the bleedthrough contribution strengths that may then be used to estimate the background fluorescence intensity of each nucleus in the corresponding image (S2D Fig). The measurements may then be corrected through application of Eq 4.

Clone annotation algorithm

We assume the measured fluorescence level x_i for cell i is sampled from an underlying distribution p_m(x) for cells carrying m copies of the gene encoding the clonal marker:

$$x_i \sim p_m(x) \tag{5}$$

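We further assume that p_m(x) is composed of a mixture of one or more lognormal distributions:

$$p_m(x) = \sum_{n} \lambda_n \, p(\ln x \mid \mu_n, \sigma_n^2) \tag{6}$$
$$p(\ln x \mid \mu_n, \sigma_n^2) = \mathcal{N}(\ln x;\ \mu_n, \sigma_n^2) \tag{7}$$

where 0 ≤ λ_n ≤ 1 are the mixing proportions, and μ_n and σ_n² are the mean and variance of the nth distribution. This assumption is supported by both empirical observations and theoretical insights [46, 47]. By superposition, the global distribution of measured fluorescence levels p(ln x) across all values of m is also a mixture of K components:

$$p(\ln x) = \sum_{m} \alpha_m \, p_m(x) \tag{8}$$
$$p(\ln x) = \sum_{k=1}^{K} \lambda_k \, p(\ln x \mid \mu_k, \sigma_k^2) \tag{9}$$

where α_m denotes the overall fraction of cells with m copies of the gene encoding the clonal marker. For brevity, we substitute X = ln x and θ_k = (μ_k, σ_k²), yielding:

$$p(X) = \sum_{k=1}^{K} \lambda_k \, p(X \mid \theta_k) \tag{10}$$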

Given a collection of sampled fluorescence levels, {X_i}_{i = 1…N}, we use expectation maximization to find values of θ_k and λ_k for each of the model’s K components that maximize the log-likelihood of the observed sample. We repeat this procedure for a range of sequential values of K, resulting in multiple models of increasing size. We then balance model resolution against overfitting by selecting the model that yields the smallest value of the Bayesian Information Criterion (BIC):

$$\mathrm{BIC}_K = q_K \ln N - 2 \ln \hat{L}_K \tag{11}$$
$$q_K = 3K - 1 \tag{12}$$

where N is the sample size, ln L̂_K is the maximum value of the log-likelihood, the subscript K denotes the number of mixture components in the model, and q_K is the total number of parameters (i.e. K − 1 values of λ_k and 2K values of μ_k and σ_k²).
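The model selection step can be illustrated with a minimal sketch. It assumes the standard parameter count for a univariate mixture (K − 1 mixing proportions plus K means and K variances) and takes the maximized log-likelihoods as given; the function names and the numerical values are hypothetical and are not taken from the Fly-QMA package.

```python
import math

def n_parameters(k: int) -> int:
    """Free parameters of a univariate K-component mixture:
    K - 1 mixing proportions, K means, and K variances."""
    return (k - 1) + 2 * k

def bic(max_log_likelihood: float, k: int, n_samples: int) -> float:
    """Bayesian Information Criterion; smaller values are preferred."""
    return n_parameters(k) * math.log(n_samples) - 2.0 * max_log_likelihood

def select_model(log_likelihoods: dict, n_samples: int) -> int:
    """Return the number of components K whose model has the smallest BIC."""
    return min(log_likelihoods,
               key=lambda k: bic(log_likelihoods[k], k, n_samples))

# Hypothetical maximized log-likelihoods for K = 1..4, each fit to N = 1000 samples.
fits = {1: -1500.0, 2: -1300.0, 3: -1295.0, 4: -1294.0}
best_k = select_model(fits, n_samples=1000)
```

In this example the gain in likelihood beyond two components is too small to offset the penalty on model size, so the two-component model is selected.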

Applying Bayes’ rule to the selected model yields the posterior probabilities that each sample X_i belongs to the kth component:

$$p(k \mid X_i) = \frac{\lambda_k \, p(X_i \mid k)}{p(X_i)} \tag{13}$$

where p(X_i∣k) is evaluated using the model’s likelihood function and p(X_i) is evaluated by marginalizing across each of the model’s K components. The end result is a mixture model that allows us to predict the probability that a given measurement of clonal marker expression belongs to a particular one of its component distributions.
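As a minimal sketch of this step, the posterior for a univariate mixture of normal components in log-fluorescence space may be computed as follows. The parameter values are hypothetical and the function names are illustrative only.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posterior(x, weights, means, variances):
    """p(k | x) for every component k, obtained with Bayes' rule."""
    joint = [w * normal_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    evidence = sum(joint)  # p(x), marginalized across all components
    return [j / evidence for j in joint]

# Hypothetical two-component model; X is the log-transformed fluorescence level.
weights, means, variances = [0.5, 0.5], [0.0, 2.0], [0.25, 0.25]
p_mid = posterior(1.0, weights, means, variances)   # equidistant from both means
p_low = posterior(0.0, weights, means, variances)   # sits on the first mean
```

A measurement halfway between two equally weighted components is maximally ambiguous, whereas a measurement at one component's mean is assigned to that component with high probability.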

We then define a many-to-one mapping, f, from each of the K components of the mixture to one of the three possible values of m:

$$f : \{1, \dots, K\} \to \{0, 1, 2\} \tag{14}$$

We determine the mapping by k-means clustering the K component distributions into three groups on the basis of their mean values, μ_k. We may then assign a genotype label m to each measurement X_i by predicting the component k from which it was sampled.
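The mapping can be sketched with a small one-dimensional k-means routine. This is an illustrative implementation of Lloyd's algorithm rather than the routine used by the package; the initialization at evenly spaced order statistics is an assumption, the component means are hypothetical, and at least three distinct means are assumed.

```python
def kmeans_1d(values, n_clusters=3, n_iter=100):
    """Lloyd's algorithm in one dimension. Returns one cluster index per
    value, with clusters numbered in order of increasing centroid."""
    ordered = sorted(values)
    # Initialize centroids at evenly spaced order statistics (an assumption).
    centroids = [ordered[round(i * (len(ordered) - 1) / (n_clusters - 1))]
                 for i in range(n_clusters)]
    for _ in range(n_iter):
        # Assign each value to its nearest centroid.
        assign = [min(range(n_clusters), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(n_clusters):
            members = [v for v, a in zip(values, assign) if a == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    # Relabel clusters so that 0, 1, 2 follow increasing mean expression.
    order = sorted(range(n_clusters), key=lambda c: centroids[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[a] for a in assign]

# Hypothetical means of a seven-component mixture (log-fluorescence units).
component_means = [-1.1, -0.9, 0.1, 0.2, 0.9, 1.0, 1.2]
f = kmeans_1d(component_means)  # f[k] is the genotype label m for component k
```

Here the seven components collapse onto three labels, so several components may share a genotype, which is what makes the mapping many-to-one.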

The accuracy of these labels depends upon how closely the fitted mixture model reflects the true partitioning of gene copies among clones. While finite mixtures are always identifiable given a sufficiently large sample [48], the algorithm used to fit the mixture tends toward local maxima of the likelihood function when the true components are similar (Wu, 1983). An approach based on a univariate mixture is thus inherently prone to failure when expression levels extensively overlap across clones, as variation within each clone precludes accurate classification of a cell’s genotype solely on the basis of its individual expression level. However, clonal lineages are unlikely to exist in isolation because recombination events are usually timed to generate large clones. Our strategy therefore integrates both clonal marker expression and spatial context to identify clusters of cells with locally homogeneous expression behavior.

We incorporate spatial context by introducing a second jointly-distributed variable Y_i:

$$Y_i = \frac{1}{M_i} \sum_{j=1}^{M_i} X_j \tag{15}$$

where the subscript j indexes all M_i neighbors of cell i. The new variable reflects the average expression level among the neighbors surrounding each cell. We define neighbors as pairs of cells located within a critical distance of each other. This distance, or sampling radius, is derived from the length scale over which cells retain approximately similar clonal marker expression levels. Specifically, we determine the exponential decay constant of the spatial correlation function, ψ(δ):

$$\psi(\delta) = \frac{\langle (X_i - \mu_X)(X_j - \mu_X) \rangle}{\sigma_X^2} \tag{16}$$

where μ_X and σ_X are the global mean and standard deviation, and angled brackets denote the mean across all pairs of cells separated by distance δ. We efficiently implement this procedure by fitting an exponential decay function to the down-sampled moving average of ψ(δ) as a function of increasing separation distance.
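The two quantities introduced here can be sketched in a few lines. The example assumes that ψ(δ) is the covariance between pairs of cells normalized by the global variance, evaluates it within a single distance bin, and omits the exponential fit. The fallback for isolated cells and all names are assumptions made for illustration.

```python
import math

def neighbor_average(positions, X, radius):
    """Y_i: mean of X among cells within `radius` of cell i, excluding i.
    Cells without neighbors fall back to their own value (an assumption)."""
    Y = []
    for i, (xi, yi) in enumerate(positions):
        nbrs = [X[j] for j, (xj, yj) in enumerate(positions)
                if j != i and math.hypot(xi - xj, yi - yj) <= radius]
        Y.append(sum(nbrs) / len(nbrs) if nbrs else X[i])
    return Y

def spatial_correlation(positions, X, d_min, d_max):
    """Normalized covariance across all pairs of cells whose separation
    falls in [d_min, d_max). Requires at least one such pair."""
    n = len(X)
    mu = sum(X) / n
    var = sum((x - mu) ** 2 for x in X) / n
    products = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(positions[i][0] - positions[j][0],
                           positions[i][1] - positions[j][1])
            if d_min <= d < d_max:
                products.append((X[i] - mu) * (X[j] - mu))
    return sum(products) / len(products) / var

# Two hypothetical clones of two cells each, separated by a wide gap.
positions = [(0, 0), (1, 0), (10, 0), (11, 0)]
X = [1.0, 1.0, 3.0, 3.0]
Y = neighbor_average(positions, X, radius=2.0)
psi_near = spatial_correlation(positions, X, 0.0, 2.0)
psi_far = spatial_correlation(positions, X, 8.0, 12.0)
```

Nearby cells belong to the same clone and are positively correlated, while distant pairs span different clones and are anticorrelated. The distance at which the correlation decays sets a natural sampling radius.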

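Following the introduction of spatial context, the mixture model becomes:

$$p(X, Y) = \sum_{k=1}^{K} \lambda_k \, p(X, Y \mid \theta_k) \tag{17}$$

where θ_k contains the mean and variance of each component given by vectors of length two. This formulation constrains each component’s covariance matrix to be diagonal. The posterior is now:

$$p(k \mid X_i, Y_i) = \frac{\lambda_k \, p(X_i, Y_i \mid k)}{p(X_i, Y_i)} \tag{18}$$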

We can recover the univariate model by marginalizing the posterior over all values of Y:

$$p(k \mid X_i) = \int p(k \mid X_i, Y) \, p(Y \mid X_i) \, dY \tag{19}$$

When neglecting spatial context, we use this expression to classify each sample by applying the mapping f to the value of k that maximizes p(k∣X_i):

$$m_i = f\left( \arg\max_{k} \, p(k \mid X_i) \right) \tag{20}$$

In all other cases, we deploy a graph-based approach to refine the estimate of p(k∣X_i, Y_i). This first entails constructing an undirected graph connecting adjacent cells within each image. We obtain the graph’s edges through Delaunay triangulation of the measured cell positions, then exclude distant neighbors by thresholding the edge lengths. Each edge is assigned a weight w_ij reflecting the similarity of clonal marker expression between adjacent cells i and j:

$$w_{ij} = \exp\left( -\frac{E_{ij}}{\langle E_{ij} \rangle} \right) \tag{21}$$
$$E_{ij} = \left| \ln \frac{x_i}{x_j} \right| \tag{22}$$

where E_ij is the absolute log fold-change in measured expression level and angled brackets denote the mean across all edges. We chose an exponential formulation because it yields an approximately uniform distribution of edge weights. We then detect communities within the graph using the Infomap algorithm [49]. The algorithm provides a hierarchical partitioning of nodes into non-overlapping clusters. We aggregate all clusters below a critical level that is again chosen by estimating the spatial correlation decay constant. We then enumerate p(k∣X_i, Ỹ_i), where Ỹ_i is the spatial context obtained by averaging expression levels among all neighbors in the same community as cell i.
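The edge weighting can be sketched as follows, assuming the exponential form w_ij = exp(−E_ij / ⟨E⟩), which is one reading of the description above. The edge list is taken as given (in practice it would come from a Delaunay triangulation), community detection is omitted, and the function name and data are hypothetical.

```python
import math

def edge_weights(edges, x):
    """Weight each edge by expression similarity between its two cells.
    E_ij is the absolute log fold-change; weights are exp(-E_ij / mean(E)).
    Assumes at least one edge joins cells with different expression levels."""
    E = {(i, j): abs(math.log(x[i] / x[j])) for i, j in edges}
    mean_E = sum(E.values()) / len(E)
    return {edge: math.exp(-value / mean_E) for edge, value in E.items()}

# Hypothetical triangle of cells: two with equal expression, one much brighter.
edges = [(0, 1), (1, 2), (0, 2)]
x = [1.0, 1.0, math.e ** 2]
w = edge_weights(edges, x)
```

Edges between cells with identical expression receive the maximum weight of one, and the weight decays toward zero as the fold-change grows relative to the average across all edges.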

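We further incorporate spatial context by allowing the posterior probabilities to diffuse among adjacent cells. We define the modified posterior probability p̂(k∣X_i, Ỹ_i) through a recursive relation analogous to the Katz centrality [50], initialized by the community-based posterior p(k∣X_i, Ỹ_i):

$$\hat{p}^{(0)}(k \mid X_i, \tilde{Y}_i) = p(k \mid X_i, \tilde{Y}_i) \tag{23}$$
$$\hat{p}^{(t+1)}(k \mid X_i, \tilde{Y}_i) = p(k \mid X_i, \tilde{Y}_i) + \alpha \sum_{j} w_{ij} \, \hat{p}^{(t)}(k \mid X_j, \tilde{Y}_j) \tag{24}$$

where α is the attenuation factor and w_ij are the edge weights. Expressed in matrix form, the solution for p̂ is given by:

$$\hat{\mathbf{p}} = (\mathbf{I} - \alpha \mathbf{W})^{-1} \, \mathbf{p} \tag{25}$$

where I denotes the identity matrix and W is the matrix of edge weights w_ij. We then assign a label to each measurement X_i by applying f to the value of k that maximizes p̂(k∣X_i, Ỹ_i):

$$m_i = f\left( \arg\max_{k} \, \hat{p}(k \mid X_i, \tilde{Y}_i) \right) \tag{26}$$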
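A Katz-style diffusion of this kind can be sketched by fixed-point iteration. The sketch assumes the recursion p̂ ← p + αWp̂, whose limit is (I − αW)⁻¹p whenever α times the spectral radius of W is below one. The attenuation factor, the iteration count, and whether the package renormalizes the result are not specified in the text, so the values and names below are hypothetical.

```python
def diffuse(posteriors, weights, alpha=0.1, n_iter=200):
    """Diffuse per-cell posteriors across a weighted undirected graph.

    posteriors: list with one row per cell, each a list over components k
    weights:    dict mapping undirected edges (i, j) to weights w_ij
    Iterates p_hat <- p + alpha * W p_hat from the starting point p_hat = p.
    """
    n, K = len(posteriors), len(posteriors[0])
    nbrs = {i: [] for i in range(n)}
    for (i, j), w in weights.items():
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    p_hat = [row[:] for row in posteriors]
    for _ in range(n_iter):
        p_hat = [[posteriors[i][k]
                  + alpha * sum(w * p_hat[j][k] for j, w in nbrs[i])
                  for k in range(K)]
                 for i in range(n)]
    return p_hat

# Two hypothetical cells joined by one edge, each certain of a different component.
p = [[1.0, 0.0], [0.0, 1.0]]
p_hat = diffuse(p, {(0, 1): 1.0}, alpha=0.5)
# Most probable component per cell after diffusion.
best = [max(range(len(row)), key=row.__getitem__) for row in p_hat]
```

The diffused values are not normalized, but normalizing each row would not change which component is most probable for a given cell.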

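Finally, we assess the total posterior probability of each assigned label, p̂(m_i∣X_i, Ỹ_i):

$$\hat{p}(m_i \mid X_i, \tilde{Y}_i) = \sum_{k \,:\, f(k) = m_i} \hat{p}(k \mid X_i, \tilde{Y}_i) \tag{27}$$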

This measure reflects the overall confidence that m_i is the appropriate label. Labels whose confidence falls below 80% are replaced by their counterparts estimated using the marginal classifier. This substitution helps preserve classification accuracy in situations where spatial context is not informative, and is particularly useful when the annotated clones are relatively small.
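The confidence-based substitution can be sketched as follows. The example assumes that confidence is the share of a cell's total posterior mass held by components mapped to its assigned label, and that the posteriors are normalized per cell before comparison with the threshold. The mapping and posterior values are hypothetical.

```python
def label_confidence(posterior, mapping, label):
    """Fraction of the posterior mass held by components mapped to `label`."""
    total = sum(posterior)
    return sum(p for k, p in enumerate(posterior) if mapping[k] == label) / total

def most_probable_label(posterior, mapping):
    """Label of the component with the largest posterior probability."""
    return mapping[max(range(len(posterior)), key=posterior.__getitem__)]

def assign_labels(spatial_post, marginal_post, mapping, threshold=0.8):
    """Use the spatially informed label unless its confidence falls below
    `threshold`, in which case fall back to the marginal classifier."""
    labels = []
    for sp, mp in zip(spatial_post, marginal_post):
        label = most_probable_label(sp, mapping)
        if label_confidence(sp, mapping, label) < threshold:
            label = most_probable_label(mp, mapping)
        labels.append(label)
    return labels

# Hypothetical four-component model in which components 1 and 2 share label 1.
mapping = [0, 1, 1, 2]
spatial = [[0.05, 0.50, 0.40, 0.05],   # label 1 with confidence 0.9: kept
           [0.40, 0.35, 0.00, 0.25]]   # label 0 with confidence 0.4: replaced
marginal = [[0.10, 0.70, 0.10, 0.10],
            [0.10, 0.20, 0.10, 0.60]]
labels = assign_labels(spatial, marginal, mapping)
```

Summing over all components that share a label means that a cell split between two components of the same genotype is still labeled with high confidence.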

Statistical comparison of fluorescence levels

To mitigate edge effects, cells residing on the periphery of each clone were excluded from all comparisons (S2E Fig). Border cells were identified by using a Delaunay triangulation to find all cells connected to a neighbor within a different clone. Our framework includes a simple graphical user interface that permits manual curation of which regions of the image field are included in subsequent analyses. We used this tool to limit our analysis to the region of elevated GFP expression near the morphogenetic furrow (S2F Fig). Comparisons were further restricted to cells undergoing similar stages of development (S2G Fig). These restrictions served to buffer against differences in developmental context and ensured that all compared cells were of similar developmental age. The remaining fluorescence measurements were then aggregated across all eye discs and compared between pairs of clones by two-sided Mann-Whitney U test.
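The border exclusion step can be sketched as below. The edge list is assumed to come from a Delaunay triangulation computed elsewhere, and the function name and data are hypothetical.

```python
def border_cells(edges, clone_of):
    """Return the set of cells connected to a neighbor in a different clone."""
    border = set()
    for i, j in edges:
        if clone_of[i] != clone_of[j]:
            border.update((i, j))
    return border

# Hypothetical chain of four cells: cells 0-1 in one clone, cells 2-3 in another.
edges = [(0, 1), (1, 2), (2, 3)]
clone_of = [0, 0, 1, 1]
excluded = border_cells(edges, clone_of)
interior = [i for i in range(len(clone_of)) if i not in excluded]
```

Only the interior cells would then be carried forward to the statistical comparison.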

Simulated cell growth and recombination

We simulated the two dimensional growth of a cell culture seeded with a single cell. Growth proceeds through sequential division of cells (S6A Fig). Not all cells divide at each time-step because cell division is a stochastic process. Instead, each cell divides stochastically with a rate controlled by a global growth rate parameter.

Cells in this culture carry a gene encoding a clonal marker (S6B Fig). During growth, the gene is subject to mitotic recombination (S6C Fig). Each time a cell divides, its genes are duplicated and equally partitioned between the two daughter cells. However, in some instances a heterozygous parent may instead partition its two duplicate genes unequally, with one daughter receiving both and the other receiving none. These mitotic recombination events occur stochastically with a frequency defined by a global recombination rate parameter.
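The division and recombination rules can be sketched without any spatial component. This simplified example tracks only the clonal marker dosage of each cell and omits both cell repositioning and any restriction on when recombination may occur. The function names and default rates are illustrative assumptions.

```python
import random

def divide(dosage, recombination_rate, rng):
    """Return the clonal marker dosages of the two daughter cells.
    Only heterozygous parents (one copy) are eligible for recombination,
    which yields one daughter with two copies and one with none."""
    if dosage == 1 and rng.random() < recombination_rate:
        daughters = [0, 2]
        rng.shuffle(daughters)
        return tuple(daughters)
    return (dosage, dosage)

def grow(n_cells, division_rate=0.2, recombination_rate=0.2, seed=0):
    """Grow a culture from a single heterozygous cell until it contains
    at least `n_cells` cells. Returns the list of dosages."""
    rng = random.Random(seed)
    cells = [1]
    while len(cells) < n_cells:
        next_generation = []
        for dosage in cells:
            if rng.random() < division_rate:   # stochastic division
                next_generation.extend(divide(dosage, recombination_rate, rng))
            else:
                next_generation.append(dosage)
        cells = next_generation
    return cells

cells = grow(256)
```

Because cells with zero or two copies always pass their dosage on unchanged, each recombination event founds two lineages whose genotype is fixed from that point onward.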

After each round of cell division, all cells are repositioned in order to preserve approximately uniform spatial density (S6C Fig). Repositioning is achieved by equilibrating a network of springs connecting each cell with its neighbors. This undirected network is constructed through Delaunay triangulation of all cells’ spatial positions. Edges on the periphery of the culture are systematically excluded by establishing a maximum polar angle between neighbors. This filtration removes spurious edges between distant pairs of cells. Edges connecting pairs of cells with the same clonal marker dosage are assigned a 10% higher spring constant than edges that connect dissimilar cells. This modest bias ensures that cells tend to remain proximal to their clonal lineages. Cell positions are then updated using a force-directed graph drawing algorithm [51]. Alternating cell division and repositioning steps are then repeated until a predefined population size is reached.

The timing and duration of recombination events affect the number and size of the resultant clones. In real experiments, recombination events are restricted to a particular stage of the developmental program through localized exogenous expression of the recombination machinery. We incorporated this feature into our cell growth simulations via two adjustable parameters. The first determines the minimum population size at which recombination may begin, while the second determines the number of generations over which recombination may continue to occur. These two parameters provide a means to tune the average number and size of clonal subpopulations in the synthetic data (S6D Fig). Early recombination events generally entail larger clones, while shorter recombination periods limit the extent of clone formation (S6E Fig).

Generation of synthetic microscopy data

Each simulation yields a list of spatial coordinates and gene dosages for each nucleus (S6B Fig). Synthetic measurements for each nucleus were generated by randomly sampling fluorescence levels {x₁, x₂, …, x_N} from a lognormal distribution conditioned upon the corresponding gene dosage (S7A–S7C Fig):

$$\ln x_i \sim \mathcal{N}(\mu_n, \sigma_\alpha^2) \tag{28}$$

where the subscript n denotes the gene copy number and μ_n and σ_α² are the mean and variance of the corresponding distribution. We define μ_n such that the mean fluorescence level doubles for each additional copy of the gene:

$$\mu_n = \mu_0 + n \ln 2 \tag{29}$$

where μ_0 is the log-scale mean for cells carrying zero copies.

We refer to σ_α as the fluorescence ambiguity because it modulates the similarity of fluorescence levels across gene dosages. Increasing σ_α increases the overlap among p_0(x), p_1(x), and p_2(x) (S7D and S7E Fig), and consequently increases the difficulty of the annotation task (S7F Fig).
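The sampling scheme can be sketched as follows, assuming a single log-scale spread shared by all gene dosages and a log-scale mean that shifts by ln 2 per gene copy, so that the mean fluorescence level doubles with each additional copy. The baseline `mu0` and the function name are hypothetical.

```python
import math
import random

def sample_fluorescence(dosages, sigma=0.25, mu0=0.0, seed=0):
    """Draw one lognormal fluorescence level per nucleus.
    `sigma` plays the role of the fluorescence ambiguity; `mu0` is an
    assumed log-scale mean for nuclei carrying zero copies."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu0 + n * math.log(2.0), sigma)
            for n in dosages]

# Hypothetical population with equal numbers of nuclei at each gene dosage.
per_dosage = 20000
dosages = [0] * per_dosage + [1] * per_dosage + [2] * per_dosage
x = sample_fluorescence(dosages)
means = [sum(x[n * per_dosage:(n + 1) * per_dosage]) / per_dosage
         for n in range(3)]
```

Raising `sigma` widens each distribution without moving its log-scale mean, so neighboring dosages overlap more and individual measurements become less informative about genotype.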

Synthetic benchmarking of annotation performance

We generated a large synthetic dataset spanning a broad range of sixteen different clone sizes and fluorescence ambiguities (S6D and S7F Figs, only half are shown). We performed 50 replicate simulations for each condition. All simulations were terminated when the total population exceeded 2048 cells. We assigned each cell a 20% probability of division upon each iteration, and each cell division event was accompanied by a 20% chance of mitotic recombination. Parent cells containing zero or two copies of the recombined genes were ineligible for recombination, effectively sealing the genetic fates of their respective lineages.

To annotate each set of measurements, the mixture model given by Eq 17 was independently trained and applied to each replicate. Training a single model on all replicates yields modestly stronger performance on average, but also yields more variable results across the parameter space because all labels are dependent upon the outcome of a single expectation maximization routine.

Data and software availability

We have distributed the automated mosaic analysis framework as an open-source Python package available at https://sebastianbernasek.github.io/flyqma. The associated code repository contains resources designed to help users analyze their own microscope images. These include code documentation, a guide to getting started with Fly-QMA, and an interactive tutorial that uses example data to demonstrate the core features of the software. We also intend to incorporate Fly-QMA into future versions of FlyEye Silhouette, our open-source desktop application for quantitative analysis of the larval eye. The code used to generate synthetic microscopy data is also freely available at https://github.com/sebastianbernasek/growth. All segmented and annotated eye discs are accessible via our data repository (https://doi.org/10.21985/N2F207).

Supporting information

S1 Fig. Example clones in the larval fly eye.

(A) Genetic schema for a bleedthrough control experiment. Red and green ovals represent genes encoding an RFP-tagged clonal marker and a GFP-tagged control reporter, respectively. Black lines depict a genomic locus. Recombination does not affect gene dosage of the control reporter, so GFP variation across clones is attributed to fluorescence bleedthrough. (B) Confocal image of an eye imaginal disc. Red, green, and blue reflect clonal marker, control reporter, and nuclear stain fluorescence, respectively. (C) Segmentation of the DAPI nuclear stain. White lines show individual segments.

https://doi.org/10.1371/journal.pcbi.1007406.s001

(TIF)

S2 Fig. Using background pixels to characterize bleedthrough contributions in the foreground.

(A) Extraction of background pixels (striped region). Foreground includes the merged RFP and GFP images, surrounded by a white line. White arrow marks the morphogenetic furrow (MF). (B) Background pixel values are resampled such that RFP intensities are uniformly distributed. (C) A generalized linear model characterizes the contribution of RFP bleedthrough to GFP fluorescence. Boxes reflect windowed distributions of resampled background pixel intensities. Red line shows the model fit. (D) Measured GFP levels before bleedthrough correction. Markers represent individual nuclei. Red line shows the inferred contributions of RFP fluorescence bleedthrough. Dashed portion is extrapolated. (E-G) Data curation prior to statistical comparison of GFP levels. (E) Cells on the periphery of each clone are excluded. (F) The selection is limited to the region of elevated GFP expression near the MF. (G) It is further limited to cells of the same developmental age, defined by their relative positions along the x-axis.

https://doi.org/10.1371/journal.pcbi.1007406.s002

(TIF)

S3 Fig. Training a clone annotation model.

(A) One or more images are segmented, yielding a set of fluorescence measurements X. These are used to sample the spatial context Y of the neighborhood surrounding each cell. Both sets of values are used to train a mixture model. Subsequent panels demonstrate these procedures using the example shown in S3 Fig C. (B) Expression levels are jointly distributed with the local average among neighboring cells. Center panel shows the joint distribution. Top and right bar plots show marginal distributions. (C) Mixture model identifies seven distinct components k_i. Center panel shows position and spread of each component. Top and right panels show marginal components scaled by their respective weights. Red shading denotes the label m_i assigned to each component. The model predicts the posterior probabilities that a given sample (X, Y) belongs to each component. (D) Neighborhood size is estimated by computing the decay constant of the spatial correlation function, ψ(δ). Black line shows the moving average of ψ(δ), red line shows an exponential fit. Inset shows the resultant sampling region. (E) The optimal number of mixture components is determined by minimizing BIC score. (F) Mixture components are labeled by k-means clustering their mean values. Markers reflect the component means, colors denote the assigned label.

https://doi.org/10.1371/journal.pcbi.1007406.s003

(TIF)

S4 Fig. Label assignment using a trained clone annotation model.

(A) Measurements are used to sample spatial contexts before the trained model is applied (blue and green path). In parallel, measurements are labeled using a marginal projection of the trained model (magenta path). The labels are then merged (red path). (B-D) Spatial context sampling. (B) Weighted undirected graph connecting adjacent cells. Line width reflects expression similarity between neighbors. (C) Community resolution is defined by aggregating clusters that fall below a hierarchical cut level δ. Panels show increasing levels of aggregation. Colors denote distinct communities. (D) Cut level is chosen by finding the maximum level (red dot) that remains lower than the decay constant of the spatial correlation function, ψ(δ) (black line). Panel E depicts aggregation below the third level for ease of visualization. (E) Application of the mixture model. (I) The graph contains distinct communities of locally similar expression. (II) Mean expression level within each community serves as the local average for each cell. (III) Mixture model estimates the probability that each cell belongs to each of its components. Bar plots within each cell illustrate the cumulative probability of each label. (IV) Posterior probabilities are diffused across the graph. (V) Each cell is assigned the most probable label. (F,G) Application of a marginal mixture model. (F) Marginal mixture components, shaded by their mapped labels. Dashed line is the overall marginal density. (G) Marginal classifier labels cells strictly on the basis of their individual fluorescence level. Red shading denotes the most probable label for each level. (H) Annotated measurements. Red shading denotes the assigned label. Labels with low confidence are replaced by their marginal counterparts.

https://doi.org/10.1371/journal.pcbi.1007406.s004

(TIF)

S5 Fig. Comparison of automated annotation with manually assigned labels.

(A) Distribution of labels among each possible value. (B) Visual comparison of the sole instance in which automated and manual annotation differ. Image shows clonal marker fluorescence, colors denote the assigned label.

https://doi.org/10.1371/journal.pcbi.1007406.s005

(TIF)

S6 Fig. Simulated growth of a synthetic cell culture.

(A) Partial simulation time course. Each marker depicts a cell. Greyscale intensity reflects clonal marker gene dosage. Simulation time reflects the approximate number of cell divisions since the initial seed. (B) Simulations yield gene dosages and spatial coordinates for each cell. (C) Single iteration of an example simulation. Circles represent individual cells, red shading denotes clonal marker dosage. Cycles of cell division, recombination, and repositioning are repeated until the simulation reaches a specified end time (t > 11 in panel A). (D) Cultures simulated with varying recombination start times. All cultures were subject to four generations of recombination (δt = 4). Recombination start time increases from left to right. Later recombination events generally yield smaller clones. (E) Mean clone size (cells per clone) as a function of the recombination start time. Colors denote recombination period duration. Error bars reflect standard error of the mean across 50 replicates. Clone size generally decreases as recombination is limited to later times.

https://doi.org/10.1371/journal.pcbi.1007406.s006

(TIF)

S7 Fig. Tunable generation of synthetic microscopy data.

(A) Fluorescence levels are sampled from lognormal distributions conditioned upon gene dosage. (B) Synthetic data include a measured fluorescence level for each reporter in each cell. Text color reflects the generative distribution in A. (C) Synthetic image of clonal marker fluorescence when σ_α = 0.25. Each nucleus is shaded in accordance with its sampled fluorescence intensity. (D-F) Left to right, increasing the fluorescence ambiguity parameter broadens the overlap in fluorescence levels across gene dosages. (D) Distributions used to generate clonal marker fluorescence levels. Red shading denotes gene dosage. (E) Evenly weighted sum of the generative distributions. (F) Example images of clonal marker fluorescence.

https://doi.org/10.1371/journal.pcbi.1007406.s007

(TIF)

S8 Fig. Fraction of nuclei correctly labeled during synthetic benchmarking.

Each pixel reflects the average across 50 replicates. Clone size reflects the mean number of cells per clone. Performance improves with increasing clone size and worsens with increasing fluorescence ambiguity.

https://doi.org/10.1371/journal.pcbi.1007406.s008

(TIF)

S9 Fig. Spatial context is most informative for large clones with ambiguous fluorescence.

(A) MAE of labels assigned using a marginal classifier that neglects spatial context. Performance worsens with increasing fluorescence ambiguity but does not depend upon clone size. (B) Annotation performance relative to the marginal classifier. Color scale reflects the log₂ fold-change in MAE when spatial context is neglected. Blue indicates that spatial context improves performance.

https://doi.org/10.1371/journal.pcbi.1007406.s009

(TIF)

References

1. Oates AC, Gorfinkiel N, González-Gaitán M, Heisenberg CP. Quantitative approaches in developmental biology; 2009. Available from: http://www.nature.com/articles/nrg2548.
2. Muzzey D, van Oudenaarden A. Quantitative time-lapse fluorescence microscopy in single cells. Annual Review of Cell and Developmental Biology. 2009;25(1):301–327. pmid:19575655
3. Stelzer EHK. Light-sheet fluorescence microscopy for quantitative biology. Nature Methods. 2014;12(1):23–26.
4. Truong TV, Supatto W. Toward high-content/high-throughput imaging and analysis of embryonic morphogenesis. Genesis. 2011;49(7):555–569. pmid:21504047
5. Aghaeepour N, Finak G, Hoos H, Mosmann TR, Brinkman R, Gottardo R, et al. Critical assessment of automated flow cytometry data analysis techniques. Nature Methods. 2013;10(3):228–238. pmid:23396282
6. Chen X, Hasan M, Libri V, Urrutia A, Beitz B, Rouilly V, et al. Automated flow cytometric analysis across large numbers of samples and cell types. Clinical Immunology. 2015;157(2):249–260. pmid:25576660
7. Pyne S, Maier LM, Lin TI, Wang K, Rossin E, Hu X, et al. Automated high-dimensional flow cytometric data analysis. Proceedings of the National Academy of Sciences. 2009;106(21):8519–8524.
8. Bernstein BE, Brown M, Johnson DS, Liu XS, Nussbaum C, Myers RM, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biology. 2008;9(9):R137. pmid:18798982
9. Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J. qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biology. 2007;8(2). pmid:17291332
10. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012;9(4):357–9. pmid:22388286
11. Trapnell C, Pachter L, Salzberg SL. TopHat: Discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–1111. pmid:19289445
12. Costes SV, Daelemans D, Cho EH, Dobbin Z, Pavlakis G, Lockett S. Automatic and quantitative measurement of protein-protein colocalization in live cells. Biophysical Journal. 2004;86(6):3993–4003. pmid:15189895
13. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. The Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols. 2015;10(6):845–858. pmid:25950237
14. Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang IH, Friman O, et al. CellProfiler: Image analysis software for identifying and quantifying cell phenotypes. Genome Biology. 2006;7(10):R100. pmid:17076895
15. Paintdakhi A, Parry B, Campos M, Irnov I, Elf J, Surovtsev I, et al. Oufti: An integrated software package for high-accuracy, high-throughput quantitative microscopy analysis. Molecular Microbiology. 2016;99(4):767–777. pmid:26538279
16. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, et al. Fiji: An open-source platform for biological-image analysis; 2012. Available from: http://www.nature.com/articles/nmeth.2019.
17. Sommer C, Straehle C, Kothe U, Hamprecht FA. Ilastik: Interactive learning and segmentation toolkit. In: Proceedings—IEEE International Symposium on Biomedical Imaging. 2011. p. 230–233. Available from: http://ieeexplore.ieee.org/document/5872394/.
18. Jug F, Pietzsch T, Preibisch S, Tomancak P. Bioimage informatics in the context of Drosophila research. Methods. 2014;68(1):60–73. pmid:24732429
19. Sbalzarini IF. Seeing is believing: Quantifying is convincing: Computational image analysis in biology. Advances in Anatomy, Embryology, and Cell Biology. 2016;219:1–39. pmid:27207361
20. Schindelin J, Rueden CT, Hiner MC, Eliceiri KW. The ImageJ ecosystem: An open platform for biomedical image analysis; 2015. Available from: http://doi.wiley.com/10.1002/mrd.22489.
21. Simpson IT, Price DJ. Pax6; a pleiotropic player in development; 2002. Available from: http://doi.wiley.com/10.1002/bies.10174.
22. Parody TR, Muskavitch MAT. The pleiotropic function of Delta during postembryonic development of Drosophila melanogaster. Genetics. 1993;135(2):527–539. pmid:8244012
23. Shilo BZ, Raz E. Developmental control by the Drosophila EGF receptor homolog DER; 1991. Available from: https://www.sciencedirect.com/science/article/pii/016895259190261N.
24. Xu T, Rubin GM. Analysis of genetic mosaics in developing and adult Drosophila tissues. Development. 1993;117(4):1223–37. pmid:8404527
25. Xu T, Rubin GM. The effort to make mosaic analysis a household tool. Development. 2012;139(24):4501–4503. pmid:23172911
26. Newsome TP, Asling B, Dickson BJ. Analysis of Drosophila photoreceptor axon guidance in eye-specific mosaics. Development. 2000;127(4):851–60. pmid:10648243
27. Theodosiou NA, Xu T. Use of FLP/FRT system to study Drosophila development. Methods. 1998;14(4):355–365. pmid:9608507
28. Struhl G, Basler K. Organizing activity of wingless protein in Drosophila. Cell. 1993;.
29. Halfar K, Rommel C, Stocker H, Hafen E. Ras controls growth, survival and differentiation in the Drosophila eye by different thresholds of MAP kinase activity. Development. 2001;128(9):1687–96. pmid:11290305
30. Tomlinson A, Struhl G. Delta/Notch and Boss/Sevenless signals act combinatorially to specify the Drosophila R7 photoreceptor. Molecular Cell. 2001;7(3):487–95. pmid:11463374
31. Yang L, Baker NE. Role of the EGFR/Ras/Raf pathway in specification of photoreceptor cells in the Drosophila retina. Development. 2001;128(7):1183–91. pmid:11245584
32. Huang J, Wu S, Barrera J, Matthews K, Pan D. The Hippo signaling pathway coordinately regulates cell proliferation and apoptosis by inactivating Yorkie, the Drosophila homolog of YAP. Cell. 2005;122(3):421–434. pmid:16096061
33. Thompson BJ, Cohen SM. The Hippo pathway regulates the bantam microRNA to control cell proliferation and apoptosis in Drosophila. Cell. 2006;126(4):767–774. pmid:16923395
34. Atkins M. Drosophila genetics: The power of genetic mosaic approaches. In: Methods in Molecular Biology. vol. 1893. Humana Press, New York, NY; 2019. p. 27–42. Available from: http://link.springer.com/10.1007/978-1-4939-8910-2_2.
35. Enomoto M, Siow C, Igaki T. Drosophila as a cancer model. In: Advances in Experimental Medicine and Biology. vol. 1076. Springer, Singapore; 2018. p. 173–194. Available from: http://link.springer.com/10.1007/978-981-13-0529-0_10.
36. Germani F, Bergantinos C, Johnston LA. Mosaic analysis in Drosophila. Genetics. 2018;208(2):473–490. pmid:29378809
37. Dai W, Peterson A, Kenney T, Burrous H, Montell DJ. Quantitative microscopy of the Drosophila ovary shows multiple niche signals specify progenitor cell fate. Nature Communications. 2017;8(1):1244. pmid:29093440
38. Bernasek SM, Lachance JFB, Peláez N, Bakker R, Navarro HT, Amaral LAN, et al. Ratio-based sensing of two transcription factors regulates the transit to differentiation. bioRxiv. 2018; p. 430744.
39. Ghiglione C, Jouandin P, Cérézo D, Noselli S. The Drosophila insulin pathway controls Profilin expression and dynamic actin-rich protrusions during collective cell migration. Development. 2018;145(14):dev161117. pmid:29980565
40. Li K, Baker NE. Regulation of the Drosophila ID protein Extra macrochaetae by proneural dimerization partners. Elife. 2018;7.
41. Bacia K, Petrášek Z, Schwille P. Correcting for spectral cross-talk in dual-color fluorescence cross-correlation spectroscopy. ChemPhysChem. 2012;13(5):1221–1231. pmid:22344749
42. Elangovan M, Wallrabe H, Chen Y, Day RN, Barroso M, Periasamy A. Characterization of one- and two-photon excitation fluorescence resonance energy transfer microscopy. Methods. 2003;29(1):58–73. pmid:12543072
43. Mort RL. Quantitative analysis of patch patterns in mosaic tissues with ClonalTools software. Journal of Anatomy. 2009;215(6):698–704. pmid:19840025
44. Helmuth JA, Paul G, Sbalzarini IF. Beyond co-localization: Inferring spatial interactions between sub-cellular structures from microscopy images. BMC Bioinformatics. 2010;11(1):372. pmid:20609242
45. Shivanandan A, Radenovic A, Sbalzarini IF. MosaicIA: An ImageJ/Fiji plugin for spatial pattern and interaction analysis. BMC Bioinformatics. 2013;14(1):349. pmid:24299066
46. Furusawa C, Suzuki T, Kashiwagi A, Yomo T, Kaneko K. Ubiquity of log-normal distributions in intra-cellular reaction dynamics. Biophysics. 2005;1:25–31. pmid:27857550
47. Beal J. Biochemical complexity drives log-normal variation in genetic expression. Engineering Biology. 2017;1(1):55–60.
48. Teicher H. Identifiability of finite mixtures. The Annals of Mathematical Statistics. 1963;34(4):1265–1269.
49. Rosvall M, Axelsson D, Bergstrom CT. The map equation. European Physical Journal. 2009;.
50. Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;.
51. Kamada T, Kawai S. An algorithm for drawing general undirected graphs. Information Processing Letters. 1989;31(1):7–15.
52. van der Walt S, Schönberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, et al. scikit-image: image processing in Python. PeerJ. 2014;. pmid:25024921
53. Otsu N. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics. 1979;.
54. Bugarski M, Mansouri M, Niemann A, Rizk A, Berger P, Ziegler U, et al. Segmentation and quantification of subcellular structures in fluorescence microscopy images using Squassh. Nature Protocols. 2014;9(3):586–596. pmid:24525752
55. Peláez N, Gavalda-Miralles A, Wang B, Navarro HT, Gudjonson H, Rebay I, et al. Dynamics and heterogeneity of a fate determinant during transition towards cell differentiation. Elife. 2015;4. pmid:26583752
56. Zinchuk V, Zinchuk O, Okada T. Quantitative colocalization analysis of multicolor confocal immunofluorescence microscopy images: Pushing pixels to explore biological phenomena. Acta Histochemica et Cytochemica. 2007;40(4):101–111. pmid:17898874
57. Arsenovic PT, Mayer CR, Conway DE. SensorFRET: A standardless approach to measuring pixel-based spectral bleed-through and FRET efficiency using spectral imaging. Scientific Reports. 2017;7(1). pmid:29142199
58. Kim D, Curthoys NM, Parent MT, Hess ST. Bleed-through correction for rendering and correlation analysis in multi-colour localization microscopy. Journal of Optics. 2013;15(9). pmid:26185614
59. McMullen PD, Morimoto RI, Amaral LAN. Physically grounded approach for estimating gene expression from microarray data. Proceedings of the National Academy of Sciences. 2010;107(31):13690–13695.
60. Gaudette L, Japkowicz N. Evaluation methods for ordinal classification. In: Lecture Notes in Computer Science. vol. 5549 LNAI. Springer, Berlin, Heidelberg; 2009. p. 207–210. Available from: http://link.springer.com/10.1007/978-3-642-01818-3_25.
61. Nguyen TM, Wu QMJ. Gaussian mixture-model-based spatial neighborhood relationships for pixel labeling problems. IEEE Transactions on Systems, Man, and Cybernetics. 2012;42(1):193–202. pmid:21846606
62. Gambis A, Dourlen P, Steller H, Mollereau B. Two-color in vivo imaging of photoreceptor apoptosis and development in Drosophila. Developmental Biology. 2011;351(1):128–134. pmid:21215264
63. Dourlen P, Levet C, Mejat A, Gambis A, Mollereau B. The Tomato/GFP-FLP/FRT method for live imaging of mosaic adult Drosophila photoreceptor cells. Journal of Visualized Experiments. 2013;79.
64. Fisher YE, Yang HH, Isaacman-Beck J, Xie M, Gohl DM, Clandinin TR. FlpStop, a tool for conditional gene control in Drosophila. Elife. 2017;6.
65. Wu JS, Luo L. A protocol for mosaic analysis with a repressible cell marker (MARCM) in Drosophila. Nature Protocols. 2007;1(6):2583–2589.
66. Zhou Q, Neal SJ, Pignoni F. Mutant analysis by rescue gene excision: New tools for mosaic studies in Drosophila. Genesis. 2016;54(11):589–592. pmid:27696669
67. Heffern E, Perrimon N, Hohl AM, del Valle Rodriguez A, Bakal C, Bonvin M, et al. The twin spot generator for differential Drosophila lineage analysis. Nature Methods. 2009;6(8):600–602. pmid:19633664
68. Yu HH, Kao CF, He Y, Ding P, Kao JC, Lee T. A complete developmental sequence of a Drosophila neuronal lineage as revealed by twin-spot MARCM. PLoS Biology. 2010;8(8):39–40.
69. Denes AS, Caussinus E, Affolter M, Kanca O, Percival-Smith A. Raeppli: a whole-tissue labeling tool for live imaging of Drosophila development. Development. 2013;141(2):472–480. pmid:24335257
70. Hadjieconomou D, Rotkopf S, Alexandre C, Bell DM, Dickson BJ, Salecker I. Flybow: Genetic multicolor cell labeling for neural circuit analysis in Drosophila melanogaster. Nature Methods. 2011;8(3):260–266. pmid:21297619
71. Hampel S, Chung P, McKellar CE, Hall D, Looger LL, Simpson JH. Drosophila Brainbow: a recombinase-based fluorescence labeling technique to subdivide neural expression patterns. Nature Methods. 2011;8(3):253–259. pmid:21297621
72. Neufeld TP, De La Cruz AFA, Johnston LA, Edgar BA. Coordination of growth and cell division in the Drosophila wing. Cell. 1998;93(7):1183–1193. pmid:9657151
73. Tworoger M, Larkin MK, Bryant Z, Ruohola-Baker H. Mosaic analysis in the Drosophila ovary reveals a common Hedgehog-inducible precursor stage for stalk and polar cells. Genetics. 1999;. pmid:9927465
74. Collins RT, Linker C, Lewis J. MAZe: A tool for mosaic analysis of gene function in zebrafish. Nature Methods. 2010;7(3):219–223. pmid:20139970
75. Muñoz-Jiménez C, Ayuso C, Dobrzynska A, Torres-Mendéz A, Ruiz PdlC, Askjaer P. An efficient FLP-based toolkit for spatiotemporal control of gene expression in Caenorhabditis elegans. Genetics. 2017;206(4):1763–1778. pmid:28646043
76. Wang W, Warren M, Bradley A. Induced mitotic recombination of p53 in vivo. Proceedings of the National Academy of Sciences. 2007;104(11):4501–4505.
77. Meijering E. Cell segmentation: 50 years down the road. IEEE Signal Processing Magazine. 2012;29(5):140–145.
