CMEIAS v4.0: Advanced computational tools of bioimage informatics software designed to strengthen microscopy based approaches for understanding microbial ecology

Dazzo, Frank B, Y Yanni, J Liu, K Kwatra, A Jain, C Gross, N Philips, C Monosmith, K Klemmer, Z Ji, B Niccum, N DeSilva, D McGarrell, I Folland, P Jha, S Lundback, W Tan, D Stawkey, A Jones, D Gusfa, K Prater, N Bakir, R Sexton, M Shears, A Makhoul and S Handlesman
Dept. of Microbiology & Molecular Genetics, Michigan State Univ., East Lansing MI 48824

Presented at the All Scientists Meeting and Investigators Field Tour (2017-10-06 to 2017-10-07)

A major challenge in microbial ecology is to develop reliable methods of computer-assisted microscopy that can process and analyze complex digital images of actively growing microbial cells, populations and communities (including unculturable microbes). Such methods should reveal important insights on their phenotypic characterization without the need for laboratory cultivation, at spatial scales that enable analysis of single cells and microcolonies with their interacting neighbors occupying local ecological niches within biofilms. To address this challenge, our team of microbiologists, mathematicians and computer scientists has been developing and releasing a comprehensive suite of bioimage informatics software technologies that strengthen quantitative microscopy-based approaches to support microbial ecology research. The phenotypic information gained by CMEIAS analysis of digital images of microbial cells, populations and communities can bridge with modern genotypic technologies to fill gaps of information on the in situ ecology of microbial assemblages. Our bioimage informatics software, called CMEIAS (Center for Microbial Ecology Image Analysis System), consists of new and improved computing tools for image acquisition, processing and segmentation, object analysis and classification, data processing, statistical analysis and exploratory data mining. This research offers unique opportunities for students to participate in creative and exploratory studies of scientific software development for quantitative microscopy and computational biology with wide applications in microbial ecology. When finalized, the software tools and their comprehensive documentation are released as free downloads at our MSU CMEIAS website <http://cme.msu.edu/cmeias/>.

The first released version of CMEIAS operates within the UTHSCSA ImageTool v. 1.28 host on a PC running 32-bit and 64-bit Windows operating systems. It features components for analysis of foreground objects in digital images, reports their numerical abundance and 27 attributes of their size, shape and luminosity, provides a 1-dimensional object classifier using a single, user-specified measurement attribute and 15 upper bin limits (16 classes), and also features a unique semi-automatic supervised hierarchical tree classifier for all major (cocci, regular rods, spirals, curved rods, prosthecates, branched and unbranched filaments) and most minor (U-rods, ellipsoids, clubs and rudimentary branched rods) microbial morphotypes (for 96% of the genera in Bergey’s Manual 9th Ed.). The CMEIAS Morphotype Classifier uses pattern recognition rules operating in 14-dimensional space with 97% accuracy to report the richness and abundance of each microbial morphotype present within images of dynamically growing communities, and can add up to 5 additional (rare) morphotypes if present in the community under investigation.
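
The 1-dimensional classifier described above sorts objects by comparing one measured attribute against an ordered list of upper bin limits, so that n limits yield n + 1 classes (15 limits give 16 classes). A minimal sketch of that idea follows; the function name, the sample values, and the choice to treat each upper limit as inclusive are assumptions made for illustration, not details taken from CMEIAS.

```python
from bisect import bisect_left
from collections import Counter


def classify_by_upper_limits(values, upper_limits):
    """Return a class index (0 .. len(upper_limits)) for each value.

    Class i holds values above the previous limit and <= upper_limits[i]
    (inclusive upper limit is an assumption). The final class holds every
    value above the largest limit, so n limits produce n + 1 classes.
    """
    limits = sorted(upper_limits)
    return [bisect_left(limits, v) for v in values]


# Hypothetical cell lengths (micrometers) and three upper bin limits.
lengths = [0.4, 1.0, 1.7, 2.5, 6.0]
limits = [1.0, 2.0, 4.0]
classes = classify_by_upper_limits(lengths, limits)
abundance_per_class = Counter(classes)  # number of objects in each class
```

Binary search keeps each lookup logarithmic in the number of limits, which matters little for 15 limits but keeps the sketch simple and free of off-by-one loops.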

CMEIAS-ImageTool features several image processing routines. The host program includes contrast / brightness adjustment, object sharpening, background subtraction, median smoothing, color-to-grayscale conversion, stack averaging, histogram stretch, neighborhood convolution, and a manual pixel editor. The CMEIAS software suite includes 2 image-editing tools to facilitate the segmentation of objects in images prior to analysis. A stand-alone CMEIAS Color Segmentation software application has been developed, documented and released with improved technologies for accurate segmentation of foreground objects within complex RGB digital images where color differentiation is important, e.g., ecophysiological studies of organisms using immunofluorescence microscopy, fluorescence in situ hybridization, enzyme cytochemistry, and fluorescent gene reporter strains. It uses a novel, color-based brightness threshold algorithm applied to an interactively sampled training set of the local color pixels representing the foreground objects of interest, then implements user-adjustable tolerance settings to identify each object’s boundary differentiated by its unique color range, and finally creates a new RGB segmented output image containing these color-classified foreground objects in a noise-free background. It performs with nearly 99% accuracy based on ground truth tests of microbes in complex color images of environmental samples. This color segmentation software also features image processing routines that fill vacant holes within objects, apply a dilate/erode routine to remove pixel noise, split images into their RGB / HIS / YUV color model channels, pseudocolor object features differing in brightness to assist in defining their contour (especially useful to eliminate halos surrounding objects in fluorescence microscopy), and set thresholds to filter object size and brightness.
Our second image processing tool (CMEIAS Object Separation) is a plugin for CMEIAS-IT that automatically splits touching objects within microbial cell aggregates in thresholded images. Its unique algorithm intelligently avoids erroneous splits within the same cell and deletion of foreground pixels that reduce the size of the segmented measured objects, unlike the watershed segmentation algorithm commonly featured in other image processing software.

Several other components have been introduced into the next major CMEIAS-IT upgrade (ver. 4.0). At its core is the CMEIAS upgrade of the UTHSCSA ImageTool host program with numerous improvements in user-friendly operation, enhanced graphical interface design to set the preferences / open the images / select the measurement features for image analysis, 12 new toolbar shortcuts of commonly used routines for image calibration / segmentation / analysis / classification, a minimum – maximum object size filter, expanded size range of image zoom ratios and object annotations, upgrades of 7 manual object analysis routines performed on user-sampled points and polygon areas of interest within the image (polygon size / luminosity metrics, luminosity-profile along a user-defined line, histogram of pixel brightness, individual point X,Y location / brightness / linear distance, and object count and tag), and an expanded Help menu providing quick access to CMEIAS user support documents and audio-visual demonstration tools.

Many new analytical features have been introduced into the CMEIAS dynamic-link library (DLL) extension plugins that operate within the CMEIAS-ImageTool ver. 4.0 upgrade. These plugin tools represent significant transformations providing breakthrough technologies including modules of image analysis with substantially increased in situ discrimination of microbial (i) biodiversity based on the statistically significant heterogeneity and clustered classification of their morphological signatures defined at 0.2 µm spatial resolution; (ii) abundance; (iii) ecophysiology, and (iv) spatial/landscape ecology. The educational scaffolding to support these modules includes well illustrated user manuals, interactive tutorial scripts, audio-visual demos, and hotlinks at the CMEIAS website to their formal descriptions and applications in scientific publications.

The upgraded Object Analysis plugin can extract up to 83 metrics (21 from ImageTool, 62 from CMEIAS) that discriminate shape (30), size (32), luminosity (15) and spatial (6) features for each object. This plugin assists studies to reveal quantitative information on microorganisms at single-cell resolution, e.g., understanding bacterial individuality to explore the mechanisms through which ecological systems work, how individual cells interact with each other and their environment, and tests of the emerging theory of individual-based modeling and ecology which predict that individual cell variation is a major driver of population structure and function. This plugin is ideal for studies of microbial autecology when based on cell tracking methods using specific fluorescent molecular probes (e.g., immunofluorescence, FISH) or autofluorescent cell components (e.g., GFP, F420). Also, several of the newly introduced measurement attributes of object shape / size / luminosity can enhance the discrimination of microcolony architecture within immature biofilms and provide insights on their interactions and colonization behavior. This plugin’s metrics of abundance can analyze object density and biomass intensity to examine community dominance / evenness / conditional rarity, seasonal productivity and food-web dynamics. Notable among its size measurement attributes are 9 biovolume formulas that are optimally adapted to measure cell body size accurately for each microbial morphotype classified by CMEIAS. 
The ecophysiology metrics of the object analysis plugin provide quantification of microbial surface area:volume adaptations, changes in body size reflecting their nutrient resource apportionment and utilization, filament elongation to resist protozoan bacteriovory, a universal allometric scaling formula to compute their biomass C and biovolume-weighted metabolic rates, luminosity for in situ variations in biofilm surface texture and color-differentiated changes in cell viability, intensity of gene expression, cell-cell communication and specific enzyme activities.
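
To illustrate why biovolume formulas are matched to morphotype, the sketch below computes volume and surface area for two textbook geometric models: a sphere for cocci and a cylinder with hemispherical ends for regular rods. These are standard geometric approximations chosen for illustration; they are not claimed to be among the 9 formulas implemented in CMEIAS, and all function names are hypothetical.

```python
import math


def sphere_biovolume(diameter):
    """Volume of a sphere: V = pi * d^3 / 6."""
    return math.pi * diameter ** 3 / 6.0


def rod_biovolume(length, width):
    """Cylinder with hemispherical ends: V = (pi / 4) * W^2 * (L - W / 3)."""
    if length < width:
        raise ValueError("length must be at least as large as width")
    return (math.pi / 4.0) * width ** 2 * (length - width / 3.0)


def rod_surface_area(length, width):
    """Surface of the same capsule shape: S = pi * W * L."""
    if length < width:
        raise ValueError("length must be at least as large as width")
    return math.pi * width * length


def surface_to_volume(surface_area, volume):
    """Surface area : volume ratio (units of 1 / length)."""
    return surface_area / volume


# Hypothetical rod, 2.0 micrometers long and 0.5 micrometers wide.
volume = rod_biovolume(2.0, 0.5)
ratio = surface_to_volume(rod_surface_area(2.0, 0.5), volume)
```

When length equals width the rod formula reduces to the sphere formula, which is a convenient sanity check on the geometry.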

A new Cumulative Object Analysis plugin features 54 discriminating metrics of landscape ecology, global abundance, and spatial patterns reported for all foreground objects (e.g., cells or microcolony biofilms) within the same image. Its metrics of global landscape ecology provide insights on the relationships among spatial architecture, processes / scale / complexity / connectivity, and nutrient apportionments of microcolony biofilm patches within the entire landscape domain. Examples of discriminating CMEIAS metrics for landscape ecology include % substratum coverage, areal and relative porosities, patch area distribution statistics, mean circular intensity, edge density, mean / maximum diffusion radii, and indices of patch length / shape / cohesion / connectivity. Its global abundance attributes measure cumulative microbial lengths revealing insights on intensity of bacteriovory selection pressures and morphological stress responses, cumulative biovolume / biosurface area / biomass carbon / spatial density / object counts providing insights on their nutrient apportionment and productivity, and calculation of dilution-adjusted object concentrations in measured sample volumes for individual populations of morphotypes or for entire communities.

The CMEIAS Spatial Analysis module has been designed to explore the microbial biogeography of biofilm assemblages across multiple spatial scales, and includes metrics for plot-less spatial point pattern, plot-based quadrat-lattice spatial aggregation / dispersion, geostatistical autocorrelation of user-selected variables and fractal geometry. A major use of this module is to test the null hypothesis of spatial randomness for the 2-dimensional position of organisms within immature biofilm landscapes, from which the type and intensity of their colonization behavior can be deduced. When applied to images of microbial biofilms, these analyses can statistically test if their patterns of spatial distribution deviate from complete randomness (as is most often the case), and then provide statistically defendable predictions of their in situ spatial scale and intensities of cooperative vs. conflicting interactions within the landscape domain. Spatial ecology attributes in the Object Analysis plugin can report the georeferenced location (Cartesian X, Y coordinates) of each object’s centroid relative to a landmark origin, shortest linear distances of each cell to its 1st and 2nd nearest neighbors, cumulative empirical distribution function of the 1st nearest neighbor distance for each cell, and a Cluster Index indicating the intensity to which each cell is clustered near its neighbors. These point-pattern data can be used to test the strength of spatially explicit, cooperative vs. conflicting colonization behavior of the microorganisms in situ. 
Geostatistical analysis of these georeferenced data can then be examined to mathematically model the spatially autocorrelated intensities of the selected Z-variate for each cell (e.g., cell biovolume, cluster index), test for anisotropy and preferential angular orientation of their spatial autocorrelation, measure the range of effective separation distances that define the real-world maximal spatial scale that individual cells or microcolony patches can exert to influence neighboring microbial cell-cell interactions and resource ecology, and produce statistically defendable pseudocolored, kriging maps of their heterogeneity in user-selected Z-variate intensities interpolated over the entire georeferenced landscape domain, even in areas not sampled. The spatial attributes available in the Cumulative Object Analysis plugin can analyze the local spatial density and dispersion of foreground objects within quadrat-lattice landscapes using area-weighted abundance of cell concentration / cumulative length / biovolume / biomass C / biosurface area, quadrat-based metrics of distances between random points and nearest neighbors for spatial aggregation / dispersion analyses, and postings of each quadrat’s X, Y centroid and its geometric centroid weighted by local object density.
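
The point-pattern attributes described above (1st and 2nd nearest neighbor distances and the empirical cumulative distribution of the 1st nearest neighbor distance) can be sketched from a list of object centroids. This is a brute-force illustration under assumed names and sample coordinates; it does not reproduce the CMEIAS implementation or its Cluster Index.

```python
import math


def nearest_neighbor_distances(points, k=2):
    """For each (x, y) point, return distances to its k nearest neighbors, ascending."""
    result = []
    for i, (xi, yi) in enumerate(points):
        distances = sorted(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        result.append(distances[:k])
    return result


def ecdf_at(values, r):
    """Empirical cumulative distribution: fraction of values <= r."""
    return sum(1 for v in values if v <= r) / len(values)


# Hypothetical centroid coordinates (micrometers) relative to a landmark origin.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
neighbors = nearest_neighbor_distances(centroids)
first_nn = [d[0] for d in neighbors]
second_nn = [d[1] for d in neighbors]
```

The all-pairs search is quadratic in the number of objects, which is acceptable for a sketch; a spatial index would be the usual choice for images containing many thousands of cells.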

Several computing tools have been introduced to support the CMEIAS Spatial Analysis module. One provides the area of the user-defined polygon when needed for plot-based spatial analysis of foreground objects within a selected area of interest that is smaller than the entire image. A second script can automate multiple user-defined iterations of a measurement task on the same image when needed to perform the spatial analysis (e.g., the Hopkins & Skellam point-pattern analysis). A third script can superimpose a user-defined color grid on the image for quadrat-based spatial dispersion analysis. Also, a stand-alone CMEIAS Quadrat Maker software application has been developed and released to assist in optimizing the grid dimensions that divide landscape images of microbial biofilms into smaller, constant size contiguous quadrats for high-resolution plot-based spatial pattern analysis of their local density, then produce an indexed image with labels identifying each gridded quadrat in the spatial domain, and finally transform a copy of the original landscape image into size-optimized individual quadrat images saved with column-row numbered annotation defined by the optimized grid that are ready for image stack construction followed by plot-based and geostatistical spatial distribution analyses.
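
As a rough illustration of quadrat-based dispersion analysis, the sketch below divides a rectangular landscape into a grid of equal quadrats, counts object centroids per quadrat, and computes the variance-to-mean ratio of those counts (values near 1 suggest randomness, larger values aggregation, smaller values uniformity). The variance-to-mean ratio is one common dispersion index used here as an example; the source does not state which quadrat indices CMEIAS reports, and all names and coordinates are hypothetical.

```python
from statistics import mean, variance


def quadrat_counts(points, width, height, n_cols, n_rows):
    """Count points per quadrat; assumes 0 <= x <= width and 0 <= y <= height."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        # Points on the far edge are assigned to the last column or row.
        col = min(int(x / width * n_cols), n_cols - 1)
        row = min(int(y / height * n_rows), n_rows - 1)
        counts[row][col] += 1
    return counts


def index_of_dispersion(counts):
    """Sample variance of quadrat counts divided by their mean."""
    flat = [c for row in counts for c in row]
    m = mean(flat)
    if m == 0:
        raise ValueError("no objects in any quadrat")
    return variance(flat) / m


# Hypothetical 100 x 100 landscape divided into a 2 x 2 grid.
uniform = [(10, 10), (60, 10), (10, 60), (60, 60)]
clustered = [(1, 1), (2, 2), (3, 3), (4, 4)]
uniform_index = index_of_dispersion(quadrat_counts(uniform, 100, 100, 2, 2))
clustered_index = index_of_dispersion(quadrat_counts(clustered, 100, 100, 2, 2))
```

The result depends on quadrat size, which is why a tool for optimizing grid dimensions is useful before this kind of analysis.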

A powerful CMEIAS JFrad software program has been developed, documented and released to analyze complex biofilm architectures based on the uniqueness of their self-similar fractal geometry. It uniquely features algorithms to compute 11 different fractal dimensions along microcolony biofilm coastlines in single images using a semi-automated wizard design, and in multiple images using a fully automated batch analysis, and then presents its quantitative analytical results optimally designed for statistical data mining, especially useful when the discriminating rank of fractal methods is not known in advance of analysis. Interestingly, several of the fractal algorithms available in this software can also discriminate spatial patterns of individual cells in the biofilm domain. These CMEIAS JFrad outputs of fractal geometry provide quantitative insights into the complexity of microbial biofilm architecture and type / intensity of colonization behavior resulting from the scale-dependent heterogeneous fractal variability in limiting resource partitioning. They reflect the high efficiency with which cells position themselves when faced with the interactive forces of microbial coexistence to maximize and compete for their apportionment of nutrient resources on a local scale within the surrounding environment.
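
For readers unfamiliar with fractal dimension estimates, the sketch below shows box counting, a widely used estimator: the number of boxes N(s) needed to cover a pixel set is counted at several box sizes s, and the dimension is taken as the least-squares slope of log N(s) against log(1/s). It is offered only as a generic example; the source does not list the 11 methods in JFrad, and the function names and test shapes are assumptions.

```python
import math


def box_count(pixels, box_size):
    """Number of box_size x box_size grid cells containing at least one pixel."""
    return len({(x // box_size, y // box_size) for x, y in pixels})


def box_counting_dimension(pixels, box_sizes):
    """Least-squares slope of log N(s) versus log(1 / s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(pixels, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Two reference shapes with known dimensions, given as integer pixel coordinates.
line = {(x, 0) for x in range(64)}
square = {(x, y) for x in range(64) for y in range(64)}
sizes = [1, 2, 4, 8, 16]
line_dimension = box_counting_dimension(line, sizes)      # close to 1
square_dimension = box_counting_dimension(square, sizes)  # close to 2
```

A biofilm coastline extracted as boundary pixels would be expected to fall between these two reference values, with the estimate sensitive to the range of box sizes chosen.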

The upgraded CMEIAS-IT Object Classification module now features 5 different object classifiers. The Single Variable Classifier (as described earlier) sorts foreground objects based on division of a scale defined by a single user-selected metric of size, shape, luminosity or spatial proximity. Any of 60 object analysis metrics can be utilized for this 1-dimensional object classifier. It also now has an expanded range that can include up to 20 upper class limits to define the decision boundaries separating each bin class, and is supported by 20 predefined calibration files to assist in optimizing the bin widths of those upper class limits. The unique CMEIAS-2 Morphotype Classifier (as described earlier) remains functional in this plugin upgrade as a hierarchical tree classifier. Its new design combines the object shape analysis plus pattern recognition mathematics of morphotype classification into a single semi-automated step (rather than running them separately) and includes an improved design of its reported output table. A CMEIAS-3 supervised hierarchical tree classifier has been added with the ability to subclassify each cell’s morphotype in the image into its Operational Morphological Units (OMU), based on a pre-established set of decision boundaries that define the morphological signatures of each cell at 0.2 µm resolution. Its rules of classification use a multilinear matrix of upper bin limits optimized as the least-overlapping borders that separate bin clusters for each measurement attribute used in the classification scheme. These upper bin limits are coded into two (“default” and “user-defined”) size border files that the user selects to operate the CMEIAS-3 OMU classifier. The “default” size border file is derived from statistical best-cut cluster analyses of microbial cell size data in the published knowledge base of taxonomically described microorganisms, and it currently distinguishes 890+ OMUs.
The “user-defined” size border file is custom built to optimize the classification of all statistically distinguishable clusters of OMUs present in the community under investigation, regardless of whether they have been described previously.

Two new object multiclassifiers (CMEIAS-4 and CMEIAS-5) are designed to combine the morphotype and OMU classifiers with the additional metrics of the CMEIAS object analysis plugin, respectively. These analytical combinations report each individual organism’s morphotype / OMU class together with any additional user-selected combination of 60+ discriminating characteristics of its shape, size, luminosity or spatial point-pattern attributes to analyze community biodiversity, resilience, succession, stress adaptation and spatial ecology. These data can then be evaluated by exploratory morphotype-weighted cluster analyses and data mining techniques to compute their statistically defined subclassification based on the (dis)similarities between microbial community structures. Various measures of ecophysiology (e.g., allometric scaling, metabolic activity, membrane integrity and viability, resource ecology, spatial positioning reflecting colonization behavior, nutrient uptake efficiency, predatory bacteriovory, dominance vs. conditional rarity, community succession and resilience following environmental perturbation, species-area and distance decay relationships) can be statistically evaluated from the output data acquired by these individual object classification-analysis modules. These two new multiclassifiers are streamlined for efficient operation and significantly expand the ability of CMEIAS to compare the diversity and ecophysiology of microbial communities and microbiomes in situ at single-cell resolution all in a single analysis.

Several new CMEIAS computing tools have been built to support its object classification module. A powerful Size Border Cluster Analysis Tool has been developed to assist in optimizing the pattern recognition rules for the CMEIAS-3 OMU classifier. It performs a 1,000-iterated simulation of an exploratory, multilinear unsupervised cluster analysis of each data array of discriminating object size/shape attributes, and then ranks the statistically significant schemes with least overlapping decision boundaries of upper class limits that separate statistically valid clusters for each array of metrics. Each upper class limit in highly ranked cluster schemes is then further validated using the Single Variable object classifier, and the resultant cluster model of optimized “best cut top solution” upper bin limits is then coded into the “user-defined” size border file to operate the OMU classifier. This flexible design facilitates the ability to recognize and classify the full range of statistically validated OMU diversity that is optimized for the specific community under investigation. Also supporting the classifiers are an improved set of easily recognized pseudocolor assignments for each morphotype in the rendered classification output image, and an Object Label edit tool that interactively reassigns objects to the best fit class when they are located at the real-world continuum between classification borders, creates new morphotype classes if necessary, and reconstructs new images of morphotype-specific or OMU-specific populations derived from community images containing multiple classified morphotypes distinguished by their assigned pseudocolors. This latter feature is also helpful when building and validating the user-defined size border file for the community examined.

A CMEIAS Data Preparation program has been developed to concatenate CMEIAS object analysis and classification data extracted from multiple images within the same community dataset, and to build the reformatted input tables of data for further analyses in other ecological statistics programs. A CMEIAS Data Toolpack COM add-in application is being built to perform numerous ecological statistics on CMEIAS population and community analysis data. It operates within Microsoft Excel to compute descriptive statistics, optimize bin widths for frequency distribution analyses, plot tolerance envelopes of confidence intervals to determine if the sample size (number of images and cells within them) is sufficient to estimate diversity, construct ranked and relative abundance plots of community structure, and compute numerous alpha and beta diversity indices and distance coefficients to compare the richness, dominance, evenness, diversity, and (dis)similarity of the analyzed microbial communities.
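
The kinds of alpha and beta diversity measures mentioned above can be sketched from a table of class abundances (for example, counts of cells per morphotype). The indices below (Shannon diversity, Simpson dominance, Pielou evenness, and Bray-Curtis dissimilarity) are standard ecological statistics chosen as examples; the source does not specify which indices the Data Toolpack computes, and the function names and counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over classes with p_i > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_dominance(counts):
    """Simpson dominance D = sum(p_i^2); higher values mean stronger dominance."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


def pielou_evenness(counts):
    """Pielou evenness J' = H' / ln(S), where S is the number of classes present."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two classes")
    return shannon_index(counts) / math.log(richness)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two communities with aligned classes."""
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    denominator = sum(a + b for a, b in zip(counts_a, counts_b))
    return numerator / denominator


# Hypothetical morphotype counts for two communities (same class order).
community_a = [10, 10, 10, 10]
community_b = [37, 1, 1, 1]
h_a = shannon_index(community_a)
j_b = pielou_evenness(community_b)
dissimilarity = bray_curtis(community_a, community_b)
```

An even community such as `community_a` gives an evenness of 1, while one dominated by a single morphotype such as `community_b` gives a value much closer to 0.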

Recent collaborative microbial ecology studies using CMEIAS include examination of the autecological biogeography and intensities of colonization behavior of rhizobial biofertilizer inoculant strains that significantly promote the growth and grain yield of rice (the world’s most important crop), how substratum physicochemistry affects freshwater biofilm architectures, the diversity / ecophysiology / spatial ecology of microbial communities developed on field-grown corn leaves with or without the genetically engineered Bt insecticide, and shifts in community structure of human vaginal microflora in health and bacterial vaginosis disease.

In summary, CMEIAS-based applications of bioimage informatics can help to fill major gaps in our understanding of microbial ecology by providing well-documented, accurate, robust and user-friendly computing tools that extract ecologically important, quantitative phenotypic information from digital images of microbes, at multiple spatial scales relevant to their diversity, abundance, ecophysiology and in situ spatial distribution without the need for cultivation. The computational power of CMEIAS software technologies for computer-assisted microscopy adds an exciting new dimension to microbial community analysis at single-cell and microcolony resolutions, and it is especially valuable when bridged to other methods of genotypic and polyphasic analyses. We maintain a CMEIAS project website (http://cme.msu.edu/cmeias/) that provides access to freely available copyrighted programs, support documents including refereed journal publications, thoroughly illustrated user manuals, help topics search files, audio-visual demos, several scripts of interactive training tutorials with accompanying test images, a periodically updated webpage entitled “Publications using CMEIAS” with hyperlinked entries describing CMEIAS and its worldwide use in research applications, and contact information.
