ViQi High-Content Analysis Automated Screening Partners Wanted

Training AVIA to run your infectivity assays

For AVIA to successfully run assays on a new virus, cell line, or imaging device, a new AI needs to be trained using the same cell line, virus, plates, and imaging instrument that will be used routinely for infectivity assays. There are three stages to training, validating, and employing AVIA.

Stage 1: Training and Validation

During Stage 1, ViQi will gather initial image data from you to confirm that our AI can be successfully trained for your assay. A single 96-well plate is populated with a cell monolayer and infected at a range of multiplicities of infection (MOIs): a high-MOI infection (e.g., MOI = 3), 10 serial 2-fold dilutions of that MOI, and a mock-infected negative control (MOI = 0), for 12 MOIs in total.
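As a quick check of the dilution scheme described above, the following Python sketch lists the 12 MOI levels. It assumes the example starting MOI of 3 from the text; the variable names are illustrative only.

```python
# Build the Stage 1 MOI series: one high-MOI infection, 10 serial 2-fold
# dilutions of it, and a mock infection (MOI = 0).
TOP_MOI = 3.0       # example value from the text; the real top MOI may differ
N_DILUTIONS = 10    # serial 2-fold dilutions below the top MOI

moi_series = [TOP_MOI / 2**k for k in range(N_DILUTIONS + 1)]
moi_series.append(0.0)  # mock-infected negative control

for index, moi in enumerate(moi_series, start=1):
    print(f"MOI level {index:2d}: {moi:.4f}")
```

The lowest infected level is 3 / 2^10, or roughly 0.003, so the series spans about three orders of magnitude.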

Plating conditions:

The 12 MOIs are shifted in the bottom half of the plate to avoid any plate-position bias.
Each well is imaged as a grid of non-overlapping images (typically 5x5).
Brightfield or phase contrast imaging is done with a 20x objective, with the focal plane positioned in the interior of the cells to maximize visualization of subcellular structure.
The whole plate should be imaged immediately after infection and then at regular intervals until cytopathic effects (CPEs) are visible by eye (typically at 0, 8, 16, and 24 hours post-infection, hpi).

Prior to AI training, the images are split into several separate training and testing sets so that AI performance is always determined from wells (technical or biological replicates) that were not used in training. Training progresses in epochs, each being a training cycle in which a subset of the training images is used to adjust the AI model. The learning rate, batch size, and other parameters affect how training progresses. Models are evaluated at the end of each epoch for their performance on a subset of the training set as well as on the reserved external test set. When performance on the test set falls below that on the training set, the model is overfitting (overtraining) to the training data and does not extrapolate well to the unseen test data. The deep neural networks used in AVIA's AI contain millions of free parameters, and with a finite amount of training data, some overfitting to the specific training examples is inevitable. Occasionally the test data shows better performance than the training data, which indicates that the dropout and regularization parameters used to suppress overfitting may be too aggressive. The results of this AI training are reported back to the investigator. See below (The AI of AVIA) for additional detail on the AIs we use.

The AI is trained on the right side of the plate only, and its success on the left side determines feasibility.
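The right/left split can be sketched in Python as follows. This is only an illustration: it assumes standard well names (A1 to H12) and that "right side" means columns 7 to 12 when column 1 is on the left. The function name is hypothetical and not part of AVIA.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A-H of a 96-well plate
COLUMNS = range(1, 13)              # columns 1-12

def split_wells_by_side(wells):
    """Return (training_wells, test_wells) split at the plate midline.

    Assumption: columns 7-12 are the right side (training) and
    columns 1-6 are the left side (testing).
    """
    training = [w for w in wells if int(w[1:]) >= 7]
    testing = [w for w in wells if int(w[1:]) <= 6]
    return training, testing

all_wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
training_wells, test_wells = split_wells_by_side(all_wells)

# No well may contribute images to both sets.
assert not set(training_wells) & set(test_wells)
print(len(training_wells), "training wells;", len(test_wells), "test wells")
```

Splitting by well rather than by image keeps all images from a given well on one side of the split, so test performance reflects how the model handles wells it has never seen.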

A successfully generalized model should predict MOIs that correlate with the MOIs used in training. A sigmoidal trend in the predictions is expected, with a lower plateau at the detection limit of the assay, an upper plateau at the saturation level, and a linear range in between where the assay is most sensitive to changes in infectivity. The AI makes these predictions on sub-image regions (tiles), so there are several predictions per image, several images per well, and several wells (technical replicates) per dilution. In a real assay, more than one dilution (and its replicates) may fall within the linear range, so these dilutions can also be aggregated into a predicted titer. Aggregating predictions per image or per well allows the variance of the predictions between replicates to be calculated; it is reported as a range, variance, and standard error.
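The aggregation from tiles to images to wells to replicates can be illustrated with a short Python sketch. The numbers below are made up for illustration and are not AVIA output; the data structure and the choice to average tiles within an image before averaging images within a well are assumptions.

```python
import math
import statistics

# Made-up predicted MOIs for three replicate wells of one dilution.
# Structure: well -> list of images -> list of per-tile predictions.
tile_predictions = {
    "B2": [[0.70, 0.74], [0.72, 0.76]],
    "C2": [[0.78, 0.82], [0.80, 0.84]],
    "D2": [[0.74, 0.78], [0.76, 0.80]],
}

def well_mean(images):
    """Average tiles within each image, then average the image means."""
    return statistics.mean(statistics.mean(tiles) for tiles in images)

well_means = {well: well_mean(images) for well, images in tile_predictions.items()}
values = list(well_means.values())

# Summary statistics across the replicate wells.
summary = {
    "mean": statistics.mean(values),
    "range": (min(values), max(values)),
    "variance": statistics.variance(values),  # sample variance
    "standard_error": statistics.stdev(values) / math.sqrt(len(values)),
}
print(summary)
```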

Because the AI generates an infectivity value for each well based on predictions from many sub-image regions, this value is equivalent to a plaque count on a single plate. Plaque counts are typically done with two or three replicates and a range of dilutions in order to obtain one that yields 50-300 plaques. A TCID50 assay, in contrast, is based on a binary result in each well of a 96-well plate (either infected or not), and so requires more wells at the limiting dilution to get a continuous variable (virus titer) from a set of fully-infected and fully-uninfected wells. In order to get a statistical range for the titer, the limiting dilution's set of wells must be repeated. The combination of smaller dilution steps and multiple wells per dilution replicate means that an AVIA assay can have 8-16 times as many samples per 96-well plate compared to TCID50.

Stage 2: Assay Qualification - Reproducibility

Once training is complete, it is important to measure the reproducibility of the assay across biological replicates. These can be stocks of virus, cell passages, and potentially closely related strains. It is also important at this stage to check negative controls and different plate layouts to ensure the assay's specificity. Lastly, it is important to check results against traditional assays, such as plaque or TCID50. The two plate orientations combined with the Stage 1 plate allow cross-checking to understand how AIs trained on two plates predict infectivity on a different plate, as well as the opportunity to combine two or three plates into a single AI training set. In each case, a table summarizes assay performance by characterizing samples from a plate different from that used in AI training and comparing the AI predictions to known virus titers observed with traditional techniques.

The plates below are training plates as above, but with a total of 8 MOIs, additional replicates of the highest MOI and uninfected control, and two different orientations to control for well-to-well variability and bias.

Stage 2 Success Criteria

Comparison of AI performance on biological replicates withheld from training provides an assessment of AI accuracy in predicting known virus titers on never before seen biological replicates. This information is presented in a table with known and predicted MOIs within the assay's pre-determined linear range. The table can be used to assess if the AIs are performing with sufficient reproducibility to advance to Stage 3, or if more biological replicates are required to train the AI on the inherent variability in biological replicates.

Stage 2a: Assay Qualification - Inactivation assays

For inhibitors of virus infection, inactivation assays are designed to measure the extent to which a compound, inactivating antibody, or biocide inhibits virus infection. In these applications, it may be advantageous to train AIs directly on phenotypes of inactivation, rather than relying solely on observing the reduction in the phenotypes of infection. In this approach, all of the wells are infected at a fixed high MOI, and the dilutions used for calibration are those of a known inhibitor or inactivating antibody. Other treatments can also be included to control for undesired effects on cells, such as cytotoxicity, in the presence or absence of virus. In this way, each sample can be assayed by multiple AIs in a single assay to provide readouts on infection titer, infectivity reduction, cytotoxicity, etc.

Whether or not AIs are trained on inhibition phenotypes, known inhibitors (or heat inactivation) should be used as positive controls when assaying inhibitors in Stage 3, with an uninhibited infection as a negative control.

This can also be an opportunity to explore different modes of action for different inhibitors. For example, inactivating antibodies typically inhibit viral entry, whereas replication inhibitors act on viruses after they enter cells (e.g. nucleoside-analogs as reverse transcriptase inhibitors in antiretroviral drugs). Prior to using AVIA in a screen for inactivation using one or more of these modes, it is important to see if AVIA is an effective detector and discriminator between the different modes.

Stage 3: Assay Plate with unknowns & ongoing verification

It is important to include positive and negative controls with every plate to ensure that the assay is working as expected. For titration assays, the same virus stock from Stage 1 and 2 should be used, or a new stock that has been titered with the initial stock for comparison. A negative control of uninfected cells is included as well. In high precision assays, it is advisable to also include one or more intermediate dilutions of the positive control to act as an additional check on the assay's linear range. When the assay is assessing inhibitor activity, the positive control is a known inhibitor acting on cells infected at high MOI, and the negative control is the uninhibited infection used throughout the plate. The high MOI should be selected based on the top of the linear range reported in the calibration curve in Stage 2.

The range of dilutions chosen for unknown samples depends on the sensitivity of the assay (the extent of the linear range reported in Stage 2), as well as the precision demands in the lab. Each well that falls within the linear assay range provides a precise estimate of the titer, so the range of dilutions in the assay depends on the expected range of the unknown titer and the known linear range of the AI. For example, if the reported linear range is a factor of 10, then 10-fold dilutions are appropriate to ensure that one of them falls within the assay's linear range. It is recommended that the assay be performed with three replicates, but if sample number is at a substantial premium relative to precision and reproducibility, then replicates can be reduced. The lower number of replicates and the typically higher linear range of the assay combine to make assay density (independent samples per plate) 8-16 times higher than common TCID50 assays.
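The reasoning about dilution steps and linear range can be checked with a small Python sketch. The linear range used here (0.01 to 0.1) is a hypothetical example spanning a factor of 10, not a value reported for any AVIA assay, and the function name is illustrative.

```python
def dilutions_in_linear_range(undiluted_value, linear_low, linear_high,
                              fold=10, n_dilutions=4):
    """Return (dilution_step, diluted_value) pairs inside [linear_low, linear_high)."""
    hits = []
    for step in range(n_dilutions):
        diluted = undiluted_value / fold**step
        if linear_low <= diluted < linear_high:
            hits.append((step, diluted))
    return hits

# Hypothetical linear range spanning a factor of 10 (not a reported AVIA value).
LINEAR_LOW, LINEAR_HIGH = 0.01, 0.1

for undiluted in (0.05, 2.5, 40.0):
    hits = dilutions_in_linear_range(undiluted, LINEAR_LOW, LINEAR_HIGH)
    for step, diluted in hits:
        # Back-calculate the undiluted value from the in-range well.
        estimate = diluted * 10**step
        print(f"undiluted {undiluted}: dilution 10^-{step} reads {diluted:g}, "
              f"back-calculated {estimate:g}")
```

With a 10-fold linear range and four 10-fold dilutions, each of the three example samples has exactly one dilution inside the range, even though the samples differ by almost three orders of magnitude.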

Example layout:

1 control, 7 samples with 4 dilutions, 3 replicates

3-log range (10^3) per sample with 10-fold dilutions
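The well count of this example layout (8 entries x 4 dilutions x 3 replicates = 96 wells) can be verified with a short Python sketch. The physical arrangement shown here, with one row per entry and the dilution series starting from the undiluted sample, is an assumption for illustration; the actual plate map may differ.

```python
import string

ROWS = string.ascii_uppercase[:8]                        # rows A-H
ENTRIES = ["control"] + [f"sample_{i}" for i in range(1, 8)]
DILUTION_EXPONENTS = [0, -1, -2, -3]   # 10-fold steps covering a 3-log range
N_REPLICATES = 3

# Assumed arrangement: one entry per row, replicates of each dilution side by side.
layout = {}
for row, entry in zip(ROWS, ENTRIES):
    column = 1
    for exponent in DILUTION_EXPONENTS:
        for replicate in range(1, N_REPLICATES + 1):
            layout[f"{row}{column}"] = (entry, exponent, replicate)
            column += 1

print(len(layout), "wells assigned")
print("A1:", layout["A1"], "| H12:", layout["H12"])
```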

The AI of AVIA

The classification model is a convolutional neural network (CNN), which uses a series of convolution kernels to extract features from input images at increasing levels of abstraction. This model is implemented using TensorFlow. The weights of these kernels are initialized to those pretrained on ImageNet. Building on the knowledge acquired by those pretrained models allows the AVIA models to converge faster and to a higher accuracy. After each cycle through the full training data (an epoch), the data is augmented with flips and rotations to increase the scope of the training examples.
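Flip-and-rotate augmentation can be sketched with NumPy as shown below. This is a minimal illustration rather than AVIA's actual pipeline: it assumes square tiles, rotations restricted to multiples of 90 degrees, and a fresh random transform for each tile in each epoch. The function name is hypothetical.

```python
import numpy as np

def random_flip_rotate(tile, rng):
    """Return the tile after a random 90-degree rotation and optional mirror."""
    quarter_turns = int(rng.integers(0, 4))     # 0, 90, 180, or 270 degrees
    augmented = np.rot90(tile, k=quarter_turns, axes=(0, 1))
    if rng.random() < 0.5:
        augmented = np.flip(augmented, axis=1)  # horizontal mirror
    return augmented

rng = np.random.default_rng(seed=0)
tile = np.arange(16, dtype=np.float32).reshape(4, 4, 1)  # toy 4x4 one-channel tile

for epoch in range(3):
    augmented_tile = random_flip_rotate(tile, rng)
    print(f"epoch {epoch}: top-left pixel is now {augmented_tile[0, 0, 0]:.0f}")
```

These transforms only rearrange pixels, so the morphology within a tile is preserved while its orientation varies from epoch to epoch.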

Depending on the strength of the morphological signals from the infection and the required accuracy, models of varying complexity are trained. A single CNN based on EfficientNetB0 can be used as a rapid stand-alone model with modest accuracy. At the other end of the scale, a total of sixteen CNNs can be trained with different base models and learning parameters (hyperparameters), and their predictions fed into an ensemble classifier such as a support vector machine (SVM) or random forest classifier (RFC). In our experiments, three other CNN architectures were used (MobileNet, InceptionResNetV2, and VGG), each with two options for dropout and two options for learning rate; with EfficientNetB0, this gives 4 x 2 x 2 = 16 models. Future work will also include tiles of multiple sizes.
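One reading of the paragraph above is that the sixteen CNNs are every combination of four base architectures, two dropout settings, and two learning rates. Under that assumption, the configurations can be enumerated as in the Python sketch below. The dropout and learning-rate values are hypothetical placeholders, since the text does not give them.

```python
from itertools import product

ARCHITECTURES = ["EfficientNetB0", "MobileNet", "InceptionResNetV2", "VGG"]
DROPOUT_OPTIONS = [0.2, 0.5]      # hypothetical values, not from the text
LEARNING_RATES = [1e-3, 1e-4]     # hypothetical values, not from the text

# One configuration per (architecture, dropout, learning rate) combination.
model_configs = [
    {"architecture": arch, "dropout": dropout, "learning_rate": lr}
    for arch, dropout, lr in product(ARCHITECTURES, DROPOUT_OPTIONS, LEARNING_RATES)
]

print(len(model_configs), "CNN configurations")
for config in model_configs[:4]:
    print(config)
```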

In ensemble models, numerical outputs of multiple CNN models (marginal probabilities of a given tile being infected) are fed as input features into an automated feature classifier trainer developed by ViQi. This trainer uses scikit-learn to try several different feature normalization, scoring and selection techniques coupled to several different high-performing classifiers. The trainer optimizes appropriate parameters for the algorithms at each stage and automatically selects the best-performing model and corresponding parameter set.
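A simplified version of this kind of automated search can be sketched with scikit-learn's Pipeline and GridSearchCV. This is not ViQi's trainer: the synthetic features stand in for real CNN outputs, and the candidate scalers, classifiers, and parameter values are assumptions chosen only to show the pattern.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_tiles, n_cnns = 200, 16

# Synthetic stand-in data: one infected/uninfected label per tile, and one
# noisy "probability of infection" per CNN for each tile.
labels = rng.integers(0, 2, size=n_tiles)
features = np.clip(
    labels[:, None] * 0.6 + 0.2 + rng.normal(0.0, 0.15, (n_tiles, n_cnns)),
    0.0, 1.0,
)

pipeline = Pipeline([("scale", StandardScaler()), ("clf", SVC())])
scalers = [StandardScaler(), MinMaxScaler()]
param_grid = [
    {"scale": scalers, "clf": [SVC()], "clf__C": [0.1, 1.0, 10.0]},
    {"scale": scalers, "clf": [RandomForestClassifier(random_state=0)],
     "clf__n_estimators": [50, 100]},
]

# Cross-validated search over normalization, classifier type, and parameters.
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(features, labels)
best_model = search.best_estimator_

print("best cross-validated accuracy:", round(search.best_score_, 3))
print("selected classifier:", type(best_model.named_steps["clf"]).__name__)
```

The search keeps the best-scoring combination and refits it on all of the data, which mirrors the "select the best-performing model and parameter set" step in a much reduced form.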

Other commonly asked questions about AVIA