Flow cytometry is unique in its ability to measure, analyze, and study vast numbers of homogeneous or heterogeneous cell populations. Today’s flow cytometers are capable of processing 100,000 cells/s and analyzing up to 70,000 cells/s, with this threshold getting higher every year. Through the use of various reporter stains, fluorescence-based being the most popular, to target surface and/or intracellular markers, operators can gather valuable datasets for analyzing diverse biochemical processes. These range from basic cell phenotyping to identifying changes in DNA content during the cell cycle, phosphorylation states, and cell-cell interactions, to name a few. We have previously reported on the uses of flow cytometry.
Up until the early 21st century, flow cytometry operators were only capable of measuring a few fluorescent markers at a time. Significant advances have been made in fluorophore and instrument technologies such that operators can now quantify up to 18 markers at a time. Additional access to stains and targets allows for significant increases in one’s ability to refine cell populations and isolate target subgroups of interest. However, every increased parameter significantly increases the quantity of data points to consider and the complexity of analysis. This introduces challenges for reproducibility, cohesive analysis, and subsequently the ability to generate meaningful discoveries. Here, we discuss the processes for analyzing flow cytometry data and addressing these concerns.
Prior to data plotting and analysis, flow cytometry datasets must undergo pre-processing to remove technical interference and poor-quality data. One of the first major obstacles is fluorescence spectral overlap. While each channel is designed to read at a certain wavelength, readings typically reflect the combined emission of every fluorophore in the sample that emits at that wavelength. Signal compensation accounts for this spillover by running representative single-stained controls, each containing only one fluorophore. This establishes a baseline measurement that is representative of that fluorophore at that specific wavelength. These values, alongside the total fluorescence signal, can then be used to produce a spillover matrix and generate compensated data.
Figure 1. Enzo Life Science’s spectra viewer depicts emission spectral overlap between Alexa Fluor 647 (Em: 655 nm) and Propidium Iodide (Em: 617 nm).
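As a rough illustration of how a spillover matrix yields compensated data, the following Python sketch inverts a small two-detector matrix with NumPy. The spillover fractions and event values are invented for illustration and do not correspond to the fluorophores in Figure 1; in practice the matrix is estimated from single-stained controls by the acquisition or analysis software.

```python
import numpy as np

# Hypothetical spillover matrix: rows are fluorophores, columns are detectors.
# Row i holds the fraction of fluorophore i's signal seen in each detector.
spillover = np.array([
    [1.00, 0.15],  # fluorophore A: 15% of its signal spills into detector B
    [0.05, 1.00],  # fluorophore B: 5% of its signal spills into detector A
])

def compensate(observed, spillover):
    """Recover per-fluorophore signal from raw detector readings.

    Assumes the linear model observed = true_signal @ spillover, where
    observed has shape (n_events, n_detectors).
    """
    return observed @ np.linalg.inv(spillover)

# Made-up "true" signals for three events, used only to check the round trip.
true_signal = np.array([[1000.0, 0.0], [0.0, 500.0], [800.0, 200.0]])
observed = true_signal @ spillover      # what the detectors would report
compensated = compensate(observed, spillover)
```

Under this linear model, the compensated values match the signals that were mixed together, which is the sense in which the spillover matrix "undoes" the overlap.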
Data transformation is also necessary to mitigate the negative downstream influence of sample asymmetry and overlapping cell populations on later analysis. This is done by choosing a common set of transformation parameters across multiple samples so that all samples are represented on a common scale fit for comparison. Commonly used transformations include logarithmic, linear-logarithmic hybrids such as Logicle and other biexponential functions, and power transformations such as Box-Cox.
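To make the idea of a shared transformation concrete, here is a minimal Python sketch of two of the simpler options named above: a logarithmic transform and a Box-Cox power transform. The function names, the clipping floor, and the value of `shared_lam` are assumptions chosen for illustration. Logicle and other biexponential transforms are more involved and are not reproduced here.

```python
import numpy as np

def log_transform(x, floor=1.0):
    """Base-10 log; values below `floor` are clipped so the log is defined."""
    return np.log10(np.clip(x, floor, None))

def box_cox(x, lam):
    """Box-Cox power transform; valid for strictly positive values only."""
    x = np.asarray(x, dtype=float)
    if lam == 0:
        return np.log(x)
    return (x ** lam - 1.0) / lam

# Two made-up samples transformed with the SAME parameter, so that both
# end up on a common scale and can be compared directly.
sample_a = np.array([10.0, 100.0, 1000.0])
sample_b = np.array([1.0, 10000.0])
shared_lam = 0.5  # hypothetical value; in practice chosen from the data
a_transformed = box_cox(sample_a, shared_lam)
b_transformed = box_cox(sample_b, shared_lam)
```

Neither function handles the negative values that compensation can produce, which is one reason hybrid scales such as Logicle are used in practice.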
Another universal problem is technical variation in sample acquisition. Common sources include reagent inconsistencies, such as degradation over time and lot-to-lot production variation, as well as differences in sample handling. Further variation is introduced when different instruments are used, since these may differ in configuration and sensitivity. These interferences make biologically equivalent populations difficult to match across samples. For this reason, data normalization is conducted to eliminate as much technical variation as possible. Doing this allows for reasonable comparison of datasets acquired over extended periods of time.
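One simple way to picture normalization is as rescaling each sample so that chosen landmarks line up with those of a reference sample. The Python sketch below aligns the 1st and 99th percentiles of one channel; the function name, the percentile choices, and the simulated batch shift are all assumptions, and published normalization methods are considerably more sophisticated.

```python
import numpy as np

def align_to_reference(sample, reference, low=1.0, high=99.0):
    """Linearly rescale `sample` so its low/high percentiles match `reference`."""
    s_lo, s_hi = np.percentile(sample, [low, high])
    r_lo, r_hi = np.percentile(reference, [low, high])
    scale = (r_hi - r_lo) / (s_hi - s_lo)
    return (sample - s_lo) * scale + r_lo

# Simulated example: the same population acquired twice, with the second
# run distorted by a made-up gain and offset standing in for batch effects.
rng = np.random.default_rng(0)
reference = rng.normal(loc=500.0, scale=50.0, size=2000)
shifted = reference * 1.3 + 50.0
aligned = align_to_reference(shifted, reference)
```

Because the simulated distortion is purely linear, the aligned values coincide with the reference. Real batch effects are rarely this clean, so a sketch like this is a starting point rather than a complete method.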
Representing flow cytometry data can be done in many different ways. For most traditional flow cytometry experiments, compensated values are plotted with single or multivariate parameters. This is followed by selective gating of target populations for further study, which will be explored later in this TechNote.
When only one parameter is considered, univariate histograms are the most common means of representing data. These generally show relative fluorescence on the X-axis and the number of events on the Y-axis. Data plotted in this format is used to evaluate which cells express a target marker at notable levels; these are designated as positive. One example is when the same target analyte is analyzed in multiple experimental samples under different treatment conditions. Cells can undergo treatment with different drugs, or with different quantities of a drug, to assess percent changes in cellular analyte expression.
Figure 2. Flow cytometry-based profiling of autophagy with Enzo Life Science’s CYTO-ID® Autophagy detection kit (ENZ-51031): Control (red-lined peak) uninduced and 10 μM Tamoxifen (ALX-550-095) treated (blue-filled peak) Jurkat cells (T-cell leukemia) were used. After 18 hours treatment, cells were loaded with CYTO-ID® Green Detection Reagent, then analyzed without washing by flow cytometry. Results are presented by histogram overlays. Control cells were stained as well but mostly display low fluorescence. In the samples treated with 10 μM Tamoxifen for 18 hours, CYTO-ID® Green dye signal increases about 2-fold, indicating that Tamoxifen causes an increase in autophagy in Jurkat cells.
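A histogram overlay of this kind can be summarized numerically by binning both samples on the same axis and comparing their medians. The Python sketch below uses synthetic log-normal data, not the data behind Figure 2; the distribution parameters and bin layout are assumptions picked so that the treated sample sits roughly two-fold above the control.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fluorescence intensities (arbitrary units), for illustration only.
control = rng.lognormal(mean=3.0, sigma=0.4, size=5000)
treated = rng.lognormal(mean=3.7, sigma=0.4, size=5000)

# Shared log-spaced bins so both histograms can be overlaid on one X-axis.
bins = np.logspace(0, 3, 65)  # 64 bins spanning 1 to 1000
control_counts, _ = np.histogram(control, bins=bins)
treated_counts, _ = np.histogram(treated, bins=bins)

# Fold change in median fluorescence between treated and control samples.
fold_change = np.median(treated) / np.median(control)
```

The two count arrays correspond to the Y-axis values of the overlaid peaks, while `fold_change` condenses the shift between them into a single number.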
With two parameters, or bivariate analysis, operators are free to display any statistic they wish on the X- or Y-axis. This could be fluorescence, FSC, or SSC, depending on what the end user is trying to measure. The frequency of events within a specific subpopulation is no longer represented on the Y-axis. Rather, it is shown in a dot plot, where the frequency of events is gauged by the density of dots. Although monochromatic dot plots are a popular way to represent data, they have a distinct limitation when it comes to distinguishing rare subgroups overshadowed by another population of interest. Using multicolor dot plots is one way to overcome this problem, as different colors can be used to signify different levels of cell density and provide a sense of proportion. Another way is to use contour plots, which are similar to topographical maps. Contours identify regions of equal cell density to show the relative frequency of populations present within a region of interest.
Figure 3. Detection of hypoxia and oxidative stress levels in cultured human HeLa and HL-60 cells using Enzo Life Science’s ROS-ID® Hypoxia/Oxidative stress detection kit (ENZ-51042). Cells were treated with hypoxia inducer (DFO) and ROS inducer (pyocyanin). Numbers in each quadrant reflect the percentage of cells (population). Results indicate that the hypoxia and oxidative stress dyes are specific.
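The two ideas behind a bivariate plot, quadrant percentages and density-based coloring, can be sketched in a few lines of Python. The function names, cut-off values, and example events below are hypothetical and unrelated to the data in Figure 3.

```python
import numpy as np

def quadrant_percentages(x, y, x_cut, y_cut):
    """Percentage of events in each quadrant defined by two cut-offs."""
    x_pos = np.asarray(x) > x_cut
    y_pos = np.asarray(y) > y_cut
    return {
        "x-y+": 100.0 * np.mean(~x_pos & y_pos),
        "x+y+": 100.0 * np.mean(x_pos & y_pos),
        "x-y-": 100.0 * np.mean(~x_pos & ~y_pos),
        "x+y-": 100.0 * np.mean(x_pos & ~y_pos),
    }

def event_density(x, y, bins=64):
    """Number of events sharing each event's 2D bin, usable as a dot color."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    # Map every event back to the bin it fell into; clipping keeps events
    # on the outermost edge inside the last bin, as histogram2d does.
    xi = np.clip(np.searchsorted(x_edges, x, side="right") - 1, 0, bins - 1)
    yi = np.clip(np.searchsorted(y_edges, y, side="right") - 1, 0, bins - 1)
    return counts[xi, yi]

# Four made-up events, one per quadrant, with cut-offs at 3 on both axes.
x = np.array([1.0, 5.0, 1.0, 5.0])
y = np.array([1.0, 1.0, 5.0, 5.0])
quadrants = quadrant_percentages(x, y, x_cut=3.0, y_cut=3.0)
```

Passing the output of `event_density` as the color argument of a plotting library's scatter function is one way to obtain a multicolor dot plot in which dense regions stand out from sparse ones.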
Traditional flow cytometry data analysis uses manual gating to selectively refine a plotted population based on experimental parameters of interest. Distinct expression patterns identify which subsets of cells to continue analyzing and which to exclude. To ensure that gating is as accurate as possible, several considerations should be made for samples prior to manual gating.
Distinguishing populations of interest based on forward and side scatter properties is an important consideration in many gating strategies. As previously mentioned in our other TechNote, forward scatter (FSC) is measured along the path of the laser and gives an estimation of a cell’s size. Side scatter (SSC), on the other hand, is measured at 90° relative to the laser and reflects the internal complexity and granularity of cells. Plotting these parameters is especially helpful when isolating dead cells, which generally display physiological differences with regard to shape and size.
Dead cells are also known to display a greater degree of fluorescence for several reasons. One is autofluorescence attributed to increased exposure of intracellular cyclic ring compounds, such as NADPH, and aromatic amino acids. Cell death also generates cell debris, which creates unwanted nonspecific binding of detection reagents. These problems can be accounted for with appropriate use of unstained controls and by implementing fluorescence minus one (FMO) controls to enhance the reliability of gate placement.
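One common way to use an FMO control for gate placement is to set the positivity threshold at a high percentile of the control's signal in the omitted channel. The Python sketch below assumes this percentile approach; the function names, the 99th-percentile default, and the example values are illustrative choices rather than a prescribed protocol.

```python
import numpy as np

def fmo_threshold(fmo_control, percentile=99.0):
    """Gate boundary: the chosen percentile of the FMO control's signal."""
    return np.percentile(fmo_control, percentile)

def percent_positive(stained, threshold):
    """Percentage of fully stained events above the FMO-derived threshold."""
    return 100.0 * np.mean(np.asarray(stained) > threshold)

# Made-up FMO control values (1 to 1000) and four fully stained events.
fmo_control = np.arange(1.0, 1001.0)
stained = np.array([100.0, 500.0, 2000.0, 3000.0])

threshold = fmo_threshold(fmo_control)
positive = percent_positive(stained, threshold)
```

Because the threshold comes from a control that contains every other fluorophore in the panel, it reflects spillover and background in that channel rather than relying on an unstained sample alone.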
Manual gating must also account for the presence of doublets in sample readouts. Doublets are single events that occur when particles pass through the interrogation point so close together that the instrument is unable to distinguish them as individual events. For events to be recorded in flow cytometers, the fluorescence signal needs to drop down to baseline after the signal has been elicited. In doublet formation, however, the pulse is unable to return to this baseline due to the proximity of the particles, and the instrument classifies them as a single event. In other words, two particles that should be registered as independent events are identified as a single event by the flow cytometer. Doublets can be identified in scatter plots because the pulse area and width of a doublet are larger than those of a single cell, while the pulse heights are very close to one another. FSC and SSC measurements, plotted on a linear scale, are the most convenient choices for identifying them.
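Since a doublet has a larger pulse area than a single cell at a similar pulse height, one simple screen is to compare each event's area-to-height ratio against the typical ratio for the sample. The Python sketch below assumes FSC area and height are both available; the function name, the 20% tolerance, and the example values are hypothetical.

```python
import numpy as np

def singlet_mask(area, height, tolerance=0.2):
    """Flag events whose area/height ratio lies close to the sample median.

    Doublets show a larger area for a similar height, so their ratio sits
    well above the median and they are excluded from the mask.
    """
    ratio = np.asarray(area, dtype=float) / np.asarray(height, dtype=float)
    center = np.median(ratio)
    return np.abs(ratio - center) <= tolerance * center

# Made-up FSC-A and FSC-H values; the fourth event has twice the area for
# its height and stands in for a doublet.
fsc_area = np.array([100.0, 200.0, 150.0, 400.0, 120.0])
fsc_height = np.array([100.0, 200.0, 150.0, 200.0, 120.0])
singlets = singlet_mask(fsc_area, fsc_height)
```

Applying the mask to every channel of the dataset, for example `data[singlets]`, keeps only the events that pass the screen before further gating.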
Challenges with Larger Parameter Analysis
Manual gating allows efficient and effective isolation of target populations in univariate or bivariate analyses. But as previously mentioned, increasing the number of parameters commensurately increases the complexity of analysis. Major problems begin to arise once multifactorial designs are incorporated, where traditional gating processes prove insufficient for these larger datasets.
More parameters result in a significantly larger quantity of data that needs to be investigated, so using traditional gating methods alone becomes a repetitive and time-consuming process. This is further complicated in studies that require phenotypes to be analyzed in tandem to resolve high-dimensional differences between cells. Given this complexity, there is a notable possibility of misidentification, overestimation, or underestimation of target populations.
With today’s flow cytometers capable of processing samples with 18+ parameters, some suggest that traditional manual gating methods severely underutilize the analytical power of today’s technology. In recognition of the need to simplify and streamline analysis more than ever, methods have been developed that rely more heavily on robotic instrumentation and automation. Automated gating is based on mathematical modeling of fluorescence intensity distributions in cell populations. This can generally be done using two approaches.
Supervised learning methods require an operator-provided dataset containing two data elements. This dataset holds all the training data that an algorithm needs to establish a relationship between the explanatory variables (labels or markers) and the dependent variable (classification based on those labels). Providing the software with these underlying parameters sets gating strategies that reflect the user’s experimental intentions and generates predefined cell populations. Unsupervised methods do not require a dependent variable or predefined reference for classification. The process relies on clustering algorithms, which treat all variables the same way and aim to group similar events into the same cluster. Each cluster contains events that are more similar to one another than to events from other clusters. In general, unsupervised methods are used when target variables are unknown or have been recorded for too small a number of cases. In short, supervised learning methods are designed to classify samples, while unsupervised learning methods are better suited to aggregating samples that exhibit similar patterns.
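The contrast between the two approaches can be sketched with two deliberately simple algorithms in Python: a nearest-centroid classifier trained on operator-supplied labels (supervised) and k-means clustering, which receives no labels (unsupervised). Both are stand-ins chosen for brevity, not the algorithms used by any particular gating software, and the two-marker data is simulated.

```python
import numpy as np

def fit_centroids(X, labels):
    """Supervised: learn one centroid per operator-provided label."""
    classes = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest(X, centroids):
    """Index of the closest centroid for every event."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

def predict(X, classes, centroids):
    """Supervised: assign each new event the label of its nearest centroid."""
    return classes[nearest(X, centroids)]

def kmeans(X, k, n_iter=50, seed=0):
    """Unsupervised: group events into k clusters without any labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        assign = nearest(X, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest(X, centroids), centroids

# Simulated events with two markers: a dim and a bright population.
rng = np.random.default_rng(0)
dim = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
bright = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
events = np.vstack([dim, bright])

# Supervised: the operator supplies a label for every training event.
labels = np.array(["neg"] * 50 + ["pos"] * 50)
classes, centroids = fit_centroids(events, labels)
predicted = predict(np.array([[0.2, -0.1], [9.8, 10.1]]), classes, centroids)

# Unsupervised: only the number of clusters is supplied.
clusters, _ = kmeans(events, k=2)
```

The supervised path returns named populations because the names were provided up front, whereas the unsupervised path returns only anonymous cluster indices that the operator must interpret afterwards.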
As technologies and automated approaches become more streamlined, the possibility of these processes replacing manual gating methods becomes more of a reality. For instance, the implementation of technologies such as Cytobank allows teams to use more homogeneous gating methods and gives them access to centralized data management between experiments. Changes such as these help biomedical research and clinical settings to be better prepared for the growing demand for complex high-throughput analysis.
Enzo’s portfolio offers a complete range of products for flow cytometry, from over 3,000 antibodies to detection kits to monitor oxidative stress, cell senescence, and much more! For more information on Enzo’s collaborative works in flow cytometry, please check our full list of application notes on this subject. For any further questions and concerns regarding any of our products, please reach out to our Technical Support team. We are here to assist you with your flow cytometry solutions!