
Journal of Immunological Methods

journal homepage: www.elsevier.com/locate/jim

Technical note

Defining ELISpot cut-offs from unreplicated test and control wells

Neal Alexander a,*, Annette Fox b,d, Vu Thi Kim Lien d, Tao Dong e, Laurel Yong-Hwa Lee e, Nguyen Le Khanh Hang d, Le Quynh Mai d, Peter Horby b,c

a London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom

b Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Programme, Hanoi, Vietnam

c Centre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 7LJ, UK

d National Institute for Hygiene and Epidemiology, 1 Yersin Street, Hanoi, Vietnam

e MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Oxford, United Kingdom

ARTICLE INFO

Article history:

Received 13 September 2012

Received in revised form 27 February 2013

Accepted 27 February 2013

Available online 7 March 2013

Keywords: ELISpot; Cut-off; Standardized

ABSTRACT

In the absence of replication of wells, empirical criteria for enzyme-linked immunospot (ELISpot) positivity use fixed differences or ratios of spot forming unit (SFU) counts between test and control wells. We propose an alternative approach which first identifies the optimally variance-stabilizing transformation of the SFU counts, based on the Bland-Altman plot of the test and control wells. The second step is to derive a positivity threshold from the difference in between-plate distribution functions of the transformed test and control SFU counts. This method is illustrated using 1309 assay results from a cohort study of influenza in Vietnam in which some, but not all, of the peptide pools have clear tendencies for SFU counts to be higher in test than control wells.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Since it was first described in 1983, the enzyme-linked immunospot (ELISpot) assay has become a widely used method for the detection of antigen-specific cytokine-secreting T cells (Czerkinsky et al., 1983; Versteegen et al., 1988), and is now a standard assay for measuring the cell-mediated immune response to vaccines in clinical trials. The requirement for immunological assays used in vaccine trials to be rigorously validated has resulted in much work to maximize the sensitivity and specificity of ELISpot assays, ensure their reproducibility, minimize inter-laboratory and inter-operator variability and to automate and standardize the counting of the spot forming units (SFU) (Vaquerano et al., 1998; Schmittel et al., 2000; Mwau et al., 2002; Janetzki et al., 2004, 2005, 2008; Cox et al., 2005; Lehmann, 2005; Samri et al., 2006; Maecker et al., 2008). However, criteria for defining a positive response have been subject to considerable debate and controversy (Mwau et al., 2002; Hudgens et al., 2004; Jamieson et al., 2006; Jeffries et al., 2006; Moodie et al., 2010; Slota et al., 2011).

Abbreviations: DMSO, dimethyl sulphoxide; ELISpot, enzyme-linked immunospot; ECDF, empirical cumulative distribution function; IAVI, International AIDS Vaccine Initiative; PBMC, peripheral blood mononuclear cell; PHA, phytohemagglutinin; ROC, receiver operating characteristic; SFU, spot forming units.

* Corresponding author.

E-mail addresses: neal.alexander@lshtm.ac.uk (N. Alexander), tao.dong@imm.ox.ac.uk (T. Dong).

0022-1759/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jim.2013.02.014

Since the spot counts in the negative control wells, which contain no stimulating analyte, are predictive of the background count in the wells that contain peptide (the experimental wells), it makes sense to use comparisons between the negative control and the experimental wells to define responsiveness (Hudgens et al., 2004). This approach is further supported by the variability in background spot counts between and within laboratories and individuals, and even within samples depending on their handling, which means that universal cut-offs are generally not credible (Hudgens et al., 2004; Cox et al., 2005). One commonly used technique to define a positive or negative response is to consider a well positive if it contains a pre-defined number of SFU above the count in the negative control well, with values of 10-50 SFU/10⁶ PBMC often being used (Schölvinck et al., 2004). This method has the disadvantage of a higher false positive probability in plates with high background, since a chance variation of, for example, 10 spots is more likely with high counts than low counts. A common alternative is to consider a well positive if its number of SFU is above a pre-defined multiple of the control, i.e. a criterion based on a ratio rather than a difference. This has the opposite disadvantage: higher false positive probability in plates with low background counts. For example, if the criterion is a four-fold ratio, and the negative control has two spots, an experimental well will be considered positive if it has more than eight spots, and this is much more likely to occur by chance than a value of 800 spots where the control well has 200. These considerations have led many groups to apply a combination of absolute and fold difference (Larsson et al., 1999; Russell et al., 2003; Jeffries et al., 2006). For example, the T-SPOT manual recommends a difference of at least 6 if the negative control has 5 or fewer spots, and a ratio of at least 2 when it has 6 or more (Oxford Immunotec, 2006). Additionally, a threshold value (e.g. at least 11 SFU/10⁶ PBMC in the experimental well) is also sometimes applied to provide a threshold of responsiveness that is considered to have biological significance. Similarly, an upper limit on the number of spots in the negative control well may be imposed, e.g. 10 in the case of T-SPOT and IAVI (International AIDS Vaccine Initiative) (Gill et al., 2010). These cut-offs and thresholds are often defined with reference to ELISpot responses in a known negative population and are therefore often referred to as empirical methods (Moodie et al., 2006).
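The combined difference-and-ratio rule quoted above from the T-SPOT manual can be written down in a few lines. The following is a minimal sketch in Python; the function name `tspot_positive` and the example counts are ours, counts are assumed to be raw SFU per well, and the optional upper limit on the negative control is deliberately left out.

```python
def tspot_positive(test_sfu, control_sfu):
    """Combined rule as quoted in the text (Oxford Immunotec, 2006)."""
    if control_sfu <= 5:
        # Low background: require an absolute excess of at least 6 spots.
        return test_sfu - control_sfu >= 6
    # Background of 6 or more: require at least a two-fold ratio over control.
    return test_sfu >= 2 * control_sfu


# Hypothetical (test, control) pairs, for illustration only.
for test_sfu, control_sfu in [(8, 2), (7, 2), (12, 6), (11, 6)]:
    print(test_sfu, control_sfu, tspot_positive(test_sfu, control_sfu))
```

Calling the same function with the two arguments swapped answers the question of whether the control well would have met the criterion had the labels been reversed.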
By contrast, statistical methods have been developed which use the variation between replicate control wells to define positivity thresholds (Hudgens et al., 2004; Moodie et al., 2006). However, when a wide range of peptides is being examined it may be impractical to include replication of the peptide and negative control wells. In the current paper we develop a positivity criterion for such plate layouts, in the context of a study of cell mediated immunological response to influenza.

2. Methods and results

We present a method which uses within-plate differences between test and control wells, and a positivity threshold based on their statistical distribution over plates. The method relies on the principle that pools can only be reliably declared positive when the test counts tend to be larger than the negative control ones. The method is illustrated using data from a cohort study in Vietnam (Horby et al., 2012).

2.1. Study population

The cohort study included 932 individuals aged between 5 and 90 years. PBMC samples were taken to measure the prevalence of T-cell responsiveness to seasonal and avian influenza peptides in order to determine the protective effect of pre-existing T-cell responses. Institutional review boards in the United Kingdom and Vietnam approved the study and all subjects provided written informed consent.

2.2. ELISpot assay

2.2.1. Antigens

The complete proteome of H3N2 (A/New York/388/2005), the haemagglutinin and neuraminidase of H1N1 (A/Hong Kong/1134/98 and A/New York/228/2003) and the haemagglutinin of H5N1 (A/Vietnam/CL26/2004) were represented as 14-20 amino acid peptides overlapping by 10 amino acids. Peptides representing each protein were tested as either 1 or 2 pools containing between 24 and 52 peptides, giving a total of 20 pools. The final culture concentration of individual peptides was 2-3 µg/ml. Phytohemagglutinin (Sigma-Aldrich) was used at 10 µg/ml.

2.2.2. PBMC preparation and IFN-γ ELISpot

Heparinized venous blood was received within eight hours of collection and immediately overlaid onto Lymphoprep then centrifuged to isolate PBMCs. PBMCs were either tested in ELISpot immediately or cryopreserved in fetal calf serum containing 10% DMSO. ELISpot was performed according to published protocols (Lalvani et al., 1997). In brief, 250,000 PBMC per well were incubated with peptide pools, PHA or media-only (negative control) overnight. ELISpot plates were scanned using a Cellular Technology Ltd. Series 3A Analyzer. Spots were then counted using ImmunoSpot 3.1 software. Spot definition settings were as follows: sensitivity 170; minimum spot size 0.0142 mm²; maximum spot size 0.4399 mm²; oversized spots estimated; spot separation 1.00; diffuse spot process on; diffuseness 20; gradient off; overdeveloped area handling active; background balance on; background balance 30; fill holes off. Audit spots was set 'on' such that automated counting was subject to manual review whereby areas selected automatically could be deselected if they appeared to be something other than a spot from IFN-γ release. PHA wells were counted using more sensitive settings. Spot forming unit (SFU) counts were automatically transferred from an automated ELISpot counter (Cellular Technology Limited) to a Microsoft Access database, resulting in 1309 records. Of these, 758 were tested immediately and 551 cryopreserved. We present analysis of all samples irrespective of this status, although supplementary figures show that test SFU counts exceeded those of control more strongly in those samples processed immediately.

2.3. Statistical methods

The approach is to first identify a suitable data transformation and then, where feasible, choose a threshold value to define positive wells. This will be illustrated by two of the H1N1 pools from the above study.

As mentioned above, thresholds based on differences between spot counts tend to result in false positives at high values, but those based on ratios (or, equivalently, differences on the log scale) result in the opposite problem. This is because the variance of the untransformed counts increases with the mean value, and this trend is reversed by the logarithmic transformation. The property of the variance changing with the mean, whether increasing or decreasing, is known as heteroscedasticity. Since the logarithmic transformation can be seen as the limit of a series of power transformations (Tukey, 1957), e.g. square root, cube root, and so on, we seek the power which minimizes heteroscedasticity. More specifically, for each power we plot the difference of the transformed values against their average, i.e. a Bland & Altman plot (Bland and Altman, 1986), and minimize the chi-squared statistic of a test for heteroscedasticity in the corresponding regression (Breusch and Pagan, 1979). We also used the studentized version of the test, which is more robust to non-Gaussian variation (Koenker, 1981), and the results remained identical to at least two decimal places. Once the power transformation has been selected, the regression is not used further.
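The selection of the power can be sketched as a grid search. The Python below is an illustration under stated assumptions rather than the authors' R code: it uses the studentized (Koenker) form of the statistic, computed as n times the R-squared from regressing the squared Bland-Altman residuals on the average, and the function names, the candidate grid and the simulated counts are all ours.

```python
import math
import random


def ols_residuals(x, y):
    """Residuals from the simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def studentized_bp(x, y):
    """n times the R-squared from regressing squared residuals of y ~ x on x."""
    squared = [r * r for r in ols_residuals(x, y)]
    mean_sq = sum(squared) / len(squared)
    total = sum((s - mean_sq) ** 2 for s in squared)
    unexplained = sum(r * r for r in ols_residuals(x, squared))
    return len(x) * (1.0 - unexplained / total)


def heteroscedasticity_stat(test, control, power):
    """Statistic for the regression of difference on average after transforming."""
    t = [v ** power for v in test]
    c = [v ** power for v in control]
    averages = [(a + b) / 2 for a, b in zip(t, c)]
    differences = [a - b for a, b in zip(t, c)]
    return studentized_bp(averages, differences)


def select_power(test, control, powers):
    """Candidate power with the smallest heteroscedasticity statistic."""
    return min(powers, key=lambda p: heteroscedasticity_stat(test, control, p))


def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


# Simulated counts for illustration only (not the study data).
rng = random.Random(1)
backgrounds = [rng.uniform(0.5, 30.0) for _ in range(400)]
control = [poisson_draw(rng, b) for b in backgrounds]
test = [poisson_draw(rng, b) for b in backgrounds]
grid = [i / 100 for i in range(10, 101, 5)]  # powers 0.10, 0.15, ..., 1.00
print(select_power(test, control, grid))
```

A finer grid, or a one-dimensional optimizer, could replace the coarse grid once the approximate location of the minimum is known.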

For the 20 pools, the selected powers ranged from 0.23 to 0.31, mean 0.27. In other words, the optimal transformations were close to fourth root (power = 1/4). Fig. 1 shows the Bland and Altman plots for the first haemagglutinin pool, and the second neuraminidase pool. These plots also show i) the test wells positive on the T-SPOT criteria (see Introduction), and ii) the control wells which would have been positive on the same criteria, had the test and control status been reversed, hereafter referred to as pseudo-positive. For haemagglutinin, the T-SPOT-positive test wells greatly outnumber the pseudo-positive control wells (247:46), but this is not the case for neuraminidase (58:59). By quartile on the horizontal axis, the proportions positive on the T-SPOT criteria are: 0, 23, 26 and 32% for haemagglutinin and 0, 0, 6 and 16% for neuraminidase.

To select a threshold value for defining positive wells, we use the principle that test minus control values should, on average, be larger than control minus test. Otherwise, there is no evidence of a 'signal' over the 'noise' of control variation, and any positivity threshold is dubious. To select the threshold we compare the empirical cumulative distribution functions (ECDFs) of i) test-control for those plates with test > control and ii) control-test for those with control > test. The ECDF of a sample is simply the proportion of the data points which lie at or below a given value. The difference between ECDFs can be used to discriminate between a mixture of two distributions. In particular, the value which maximizes the difference in ECDFs also maximizes the probability of correct classification (Stoller, 1954). Hence, for the current purpose, we choose the threshold to be the value which maximizes the difference between the above two ECDFs. Pools whose difference over control exceeds this value are declared positive. In principle it is possible for this maximum difference in ECDFs to occur at more than one value on the horizontal axis. Hence we define the threshold, more precisely, to be the lowest such value on the horizontal axis.
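The threshold rule just described can be sketched as follows. This is a minimal Python illustration, assuming the input is one transformed test-minus-control difference per plate and that both groups are non-empty; taking the gap as the control-excess ECDF minus the test-excess ECDF is our reading of the direction of interest, and the names and example values are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(v) is the proportion of sample values at or below v."""
    ordered = sorted(sample)
    return lambda v: bisect_right(ordered, v) / len(ordered)


def ecdf_threshold(differences):
    """Return (threshold, largest ECDF gap) for transformed test-minus-control values."""
    test_excess = [d for d in differences if d > 0]      # plates with test > control
    control_excess = [-d for d in differences if d < 0]  # plates with control > test
    f_test, f_control = ecdf(test_excess), ecdf(control_excess)
    best_value, best_gap = None, float("-inf")
    # ECDFs are step functions, so the gap can only change at observed values.
    for value in sorted(set(test_excess + control_excess)):
        gap = f_control(value) - f_test(value)
        if gap > best_gap:  # strict inequality keeps the lowest maximizing value
            best_value, best_gap = value, gap
    return best_value, best_gap


# Hypothetical transformed differences, for illustration only.
differences = [6, 5, 4, 3, 1, -1, -2, -2, 0]
threshold, gap = ecdf_threshold(differences)
positive = [d for d in differences if d > threshold]
print(threshold, round(gap, 2), positive)
```

Plates with equal test and control counts contribute to neither ECDF in this sketch, which matches the strict inequalities in the text.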

This is shown in Fig. 2 for the two selected pools. Greater data values shift the ECDF to the right, making it lower at any given point on the horizontal axis. For haemagglutinin, the ECDFs of test-minus-control and control-minus-test are much more widely separated than for neuraminidase. For haemagglutinin, the maximum difference in ECDFs is 0.22 and occurs at a transformed test-minus-control value of 1 (i.e. a value greater than 1 is considered positive). For neuraminidase the maximum difference in ECDFs is 0.11 and occurs at a test-minus-control value of 0.64. Applying these threshold values to Fig. 1 gives 291 positive test wells and 63 pseudo-positive control wells for haemagglutinin. The corresponding numbers for neuraminidase are much closer (222 and 204), suggesting that reliable discrimination is not possible for neuraminidase. By quartile of the transformed mean, the proportions positive for haemagglutinin are: 0, 68, 13 and 15%, and for neuraminidase are 22, 50, 12 and 11%.

Fig. 1. Bland-Altman plots for the power-transformed counts for haemagglutinin (upper panel) and neuraminidase (lower panel). Each vertical axis is the difference between the transformed test and control values. Each horizontal axis is the average of the transformed values, on a log scale. In the main part of each plot, to the left, the points are arranged in curved lines because the original data are integers (whole numbers). In particular, the curved line closest to the top left corner of each plot contains samples with a zero control result (but varying test counts), and the line closest to the bottom left corner contains those with a zero test result (but varying control counts). The solid points are those which would be positive on the T-SPOT criteria, either for test minus control, or 'pseudo-positives' in which control minus test would meet those same criteria. The power which minimizes the relation between the variance and mean of the differences in transformed counts is 0.26 for haemagglutinin and 0.27 for neuraminidase. Towards the right of each plot there is a histogram of the differences between the transformed values of test and control, using the same vertical axis. For haemagglutinin, but not for neuraminidase, there is a visible pattern of positives predominating over pseudo-positives.

Fig. 2. Empirical cumulative distribution functions (ECDF, vertical axis) for the power-transformed positive differences (horizontal axis) of i) test over control (solid line) and ii) control over test (dashed line). The upper panel shows haemagglutinin and the lower one neuraminidase. Each ECDF is the proportion of data points at or below the corresponding value on the horizontal axis. Each vertical line indicates the greatest separation of the two ECDFs. For haemagglutinin, the ECDFs are further apart than for neuraminidase, indicating a greater tendency for the haemagglutinin test wells to have more SFU than the negative control.

The maximum difference between the two ECDFs is also used by the Kolmogorov-Smirnov test for differences between distributions. A large p value from this test would again suggest that reliable identification of positive samples is not possible, although the converse is not necessarily true. In other words, the p value being less than 5%, for example, does not imply that reliable identification will be possible. Rather, the hypothesis test screens out examples for which no reliable identification can be expected (Armitage et al., 2001, page 472). Over all 20 pools, the p values ranged from 2 × 10⁻¹⁶ to 0.67, those for haemagglutinin and neuraminidase being 2 × 10⁻⁹ and 0.02 respectively. Hence for some pools there is no tendency for test to exceed control, as opposed to the other way round, and in such cases trying to assign a threshold would be futile.
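As a rough illustration of this screening step, the two-sample Kolmogorov-Smirnov statistic and an approximate p value can be computed as below. This Python sketch is not the procedure used in the source, which does not say how its p values were obtained: the p value here comes from a widely used asymptotic series with a small-sample adjustment to the effective sample size, and the function names are ours.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the ECDFs of samples a and b."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in set(sa + sb)
    )


def ks_pvalue_asymptotic(d, n_a, n_b, terms=100):
    """Large-sample approximation to the two-sided p value for statistic d."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series is numerically 1 here and converges slowly
    total = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical positive differences: test over control, and control over test.
test_excess = [6, 5, 4, 3, 1]
control_excess = [1, 2, 2]
d = ks_statistic(test_excess, control_excess)
print(d, ks_pvalue_asymptotic(d, len(test_excess), len(control_excess)))
```

With samples this small the approximation is crude; an exact or permutation-based p value would be preferable in practice.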

Fig. 3. For each panel (upper haemagglutinin, lower neuraminidase), the horizontal axis is the difference between the power-transformed test and control count values. For each such value, the vertical axis shows the corresponding tail probability, i.e. the proportion of the upper and lower tails which is found in the upper tail. For example, a proportion of 75% means that the upper tail (test exceeding control by a given amount) has three times as many data points as the lower tail (control exceeding test by that amount). Values near to 50% indicate control SFU counts being, on average, about as high as test. The solid vertical lines are the 95% exact binomial confidence intervals for each proportion. These become wide at high values because they are based on few data points. The dashed vertical line in each panel is at the same value as in the corresponding panel of Fig. 2. Haemagglutinin, but not neuraminidase, shows a clear 'signal' of test over control.

This analysis can be expressed in terms of the probability of correctly identifying which pool is test and which is control, when this status is unknown. Suppose we have i) one person's test and control results x and y (possibly on a transformed scale), x being the larger, but without knowing whether x or y is test, and ii) the distribution of previous test-minus-control values (with the experimental conditions known). We expect larger values to result from the test condition, so suppose our rule is to conclude that x is from the test condition if it exceeds the smaller one by more than a value k. The conditional probability that x is the test sample, given that x − y > k, is

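\[
\operatorname{Prob}(x \text{ is test} \mid x - y > k)
= \frac{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k)}{\operatorname{Prob}(x - y > k)}
= \frac{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k)}{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k) + \operatorname{Prob}(x \text{ is control} \;\&\; x - y > k)}
\]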

This last expression is the area of the upper tail of the distribution (above a test-minus-control value of k) divided by the sum of the upper and lower tails (above k or below −k). If the control value rarely exceeds the test by k, then this probability will be high. This argument is applied to the cohort data in Fig. 3. For haemagglutinin, the test value is likely to exceed control, for a wide range of threshold values. For neuraminidase, however, the control value is about as likely to exceed test as the other way round.
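The tail proportion, together with an exact binomial interval of the kind shown in Fig. 3, can be sketched in Python as below. We assume that "exact" means the Clopper-Pearson interval, which is found here by bisection on the binomial distribution function; the function names and the example differences are hypothetical.

```python
from math import comb


def tail_proportion(differences, k):
    """Counts beyond +k and below -k, and the share found in the upper tail."""
    upper = sum(1 for d in differences if d > k)
    lower = sum(1 for d in differences if d < -k)
    share = upper / (upper + lower) if upper + lower else None
    return upper, lower, share


def binom_cdf(x, n, p):
    """Probability of at most x successes in n trials with success probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x + 1))


def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes in n trials."""
    def root(increasing):
        # Bisection for the point where an increasing function reaches alpha / 2.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if increasing(mid) < alpha / 2:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else root(lambda p: 1 - binom_cdf(x - 1, n, p))
    upper = 1.0 if x == n else 1 - root(lambda q: binom_cdf(x, n, 1 - q))
    return lower, upper


# Hypothetical transformed test-minus-control differences.
differences = [6, 5, 4, 3, 1, -1, -2, -2, 0]
upper, lower, share = tail_proportion(differences, k=1)
print(upper, lower, share, exact_binomial_ci(upper, upper + lower))
```

Repeating the calculation over a range of k values gives the kind of curve plotted in Fig. 3, with intervals widening as fewer plates remain in the tails.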

Results from simulated data confirm that the proportion of samples identified as positive increases with the excess of the test mean over the control mean (see Supplementary Material). These results also suggest that the current approach may be conservative in identifying positives. There is also a high degree of variation in the estimated proportion positive, which to some extent results from the high variation in the input data. All analyses used R version 2.11, and the custom-written functions are also included as supplementary material.

3. Discussion

Replication of ELISpot test and control wells has been recommended (Moodie et al., 2010) although it reduces the number of proteins that can be tested for given resources. Existing statistical methods utilize this replication to define positivity criteria objectively based on within-plate, between-replicate, variation (Moodie et al., 2012). In the absence of replication, the current approach relies on between-plate variation in a sizable dataset from a given population. The principle is that positivity should tend to give test wells larger counts than control wells.

One problem with existing empirical cut-offs is that large absolute differences are likely to happen by chance when spot counts are high. Log transformation reverses the problem because large fold changes from control can occur by chance at low spot counts. In statistical terms, the original and transformed datasets both have heteroscedasticity, i.e. variance associated with the mean. One solution is to use a transformation which is less strong than the logarithm. The square root transformation may suffice, for example, when the same parasite slide is read twice. This corresponds to the theoretical minimum variation, described by the Poisson distribution of homogeneous counts (Alexander et al., 2007). The current approach selects the power transformation which minimizes heteroscedasticity in the Bland & Altman plot. All of the pools in the example dataset were found to have optimal powers close to 1/4, i.e. fourth root transformation, which is between the square root and logarithm in strength.

It was notable that some protein test pools had little or no tendency to exceed the negative (medium) control in terms of spot count. Seeking positive samples is quixotic in these circumstances. In particular, applying existing empirical criteria to such pools, the number of test wells declared positive barely exceeds the number of control wells which would have been declared positive, had the test/control status been reversed in the analysis. When there is a tendency for the differences of test over control to exceed those of control over test, a positivity cutoff can be chosen by comparing their empirical distribution functions (ECDFs), by analogy with non-parametric discrimination (Stoller, 1954). The value corresponding to the maximum difference between the ECDFs gives the greatest probability of successful classification. In practice, however, false negative and false positive errors may not have equal importance, which would suggest increasing or decreasing the cut-off. This kind of calibration, e.g. by receiver operating characteristic (ROC) curve, would require independent identification of true positive and negative individuals. For influenza it is difficult to identify unexposed people from whom to prepare negative sera for this purpose. Such an approach should be more feasible for other infections such as HIV and tuberculosis.

4. Conclusions

For situations in which replication of wells is infeasible, we highlight problems with positivity criteria based on fixed differences or ratios between test and control wells, which are known as empirical methods. In our example dataset from a large cohort study, we show that some peptide pools can often be positive on such empirical criteria, while having little or no elevation in SFU over the negative control. We propose an alternative approach which uses within-plate differences between test and control wells, and a positivity threshold based on their statistical distribution over plates.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jim.2013.02.014.

Acknowledgments

The authors thank the hamlet health workers who conducted the interviews and surveillance, the Preventive Medicine Centre of Ha Nam Province, and the Ministry of Health of Vietnam for their continuing support of the research collaboration between the Oxford University Clinical Research Unit and the National Institute for Hygiene and Epidemiology.

Funding

This work was supported financially by the United Kingdom Medical Research Council grant number G7508177 to the Tropical Epidemiology Group and by the United Kingdom Wellcome Trust (grants 081613/Z/06/Z and 087982/Z/08/A). AF was supported by the European Union FP7 project "European Management Platform for Emerging and Re-emerging Infectious Disease Entities (EMPERIE)" (no. 223498).

References

Alexander, N., Bethony, J., Correa-Oliveira, R., Rodrigues, L.C., Hotez, P., Brooker, S., 2007. Repeatability of paired counts. Stat. Med. 26, 3566.

Armitage, P., Berry, G., Matthews, J.N.S., 2001. Statistical Methods in Medical Research. Blackwell Scientific Publications, Oxford.

Bland, J.M., Altman, D.G., 1986. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1, 307.

Breusch, T.S., Pagan, A.R., 1979. A simple test for heteroscedasticity and random coefficient variation. Econometrica 47, 1287.

Cox, J.H., Ferrari, G., Kalams, S.A., Lopaczynski, W., Oden, N., D'Souza, M.P., 2005. Results of an ELISPOT proficiency panel conducted in 11 laboratories participating in international human immunodeficiency virus type 1 vaccine trials. AIDS Res. Hum. Retroviruses 21, 68.

Czerkinsky, C.C., Nilsson, L.A., Nygren, H., Ouchterlony, O., Tarkowski, A., 1983. A solid-phase enzyme-linked immunospot (ELISPOT) assay for enumeration of specific antibody-secreting cells. J. Immunol. Methods 65, 109.

Gill, D.K., Huang, Y., Levine, G.L., Sambor, A., Carter, D.K., Sato, A., Kopycinski, J., Hayes, P., Hahn, B., Birungi, J., Tarragona-Fiol, T., Wan, H., Randles, M., Cooper, A.R., Ssemaganda, A., Clark, L., Kaleebu, P., Self, S.G., Koup, R., Wood, B., McElrath, M.J., Cox, J.H., Hural, J., Gilmour, J., 2010. Equivalence of ELISpot assays demonstrated between major HIV network laboratories. PLoS One 5, e14330.

Horby, P., Mai le, Q., Fox, A., Thai, P.Q., Thi Thu Yen, N., Thanh le, T., Le Khanh Hang, N., Duong, T.N., Thoang, D.D., Farrar, J., Wolbers, M., Hien, N.T., 2012. The epidemiology of interpandemic and pandemic influenza in Vietnam, 2007-2010: the Ha Nam household cohort study I. Am. J. Epidemiol. 175, 1062.

Hudgens, M.G., Self, S.G., Chiu, Y.L., Russell, N.D., Horton, H., McElrath, M.J., 2004. Statistical considerations for the design and analysis of the ELISpot assay in HIV-1 vaccine trials. J. Immunol. Methods 288, 19.

Jamieson, B.D., Ibarrondo, F.J., Wong, J.T., Hausner, M.A., Ng, H.L., Fuerst, M., Price, C., Shih, R., Elliott, J., Hultin, P.M., Hultin, L.E., Anton, P.A., Yang, O.O., 2006. Transience of vaccine-induced HIV-1-specific CTL and definition of vaccine "response". Vaccine 24, 3426.

Janetzki, S., Schaed, S., Blachere, N.E., Ben-Porat, L., Houghton, A.N., Panageas, K.S., 2004. Evaluation of Elispot assays: influence of method and operator on variability of results. J. Immunol. Methods 291, 175.

Janetzki, S., Cox, J.H., Oden, N., Ferrari, G., 2005. Standardization and validation issues of the ELISPOT assay. Methods Mol. Biol. 302, 51.

Janetzki, S., Panageas, K.S., Ben-Porat, L., Boyer, J., Britten, C.M., Clay, T.M., Kalos, M., Maecker, H.T., Romero, P., Yuan, J., Kast, W.M., Hoos, A., 2008. Results and harmonization guidelines from two large-scale international Elispot proficiency panels conducted by the Cancer Vaccine Consortium (CVC/SVI). Cancer Immunol. Immunother. 57, 303.

Jeffries, D.J., Hill, P.C., Fox, A., Lugos, M., Jackson-Sillah, D.J., Adegbola, R.A., Brookes, R.H., 2006. Identifying ELISPOT and skin test cut-offs for diagnosis of Mycobacterium tuberculosis infection in The Gambia. Int. J. Tuberc. Lung Dis. 10, 192.

Koenker, R., 1981. A note on studentizing a test for heteroscedasticity. J. Econ. 17, 107.

Lalvani, A., Brookes, R., Hambleton, S., Britton, W.J., Hill, A.V., McMichael, A.J., 1997. Rapid effector function in CD8+ memory T cells. J. Exp. Med. 186, 859.

Larsson, M., Jin, X., Ramratnam, B., Ogg, G.S., Engelmayer, J., Demoitie, M.A., McMichael, A.J., Cox, W.I., Steinman, R.M., Nixon, D., Bhardwaj, N., 1999. A recombinant vaccinia virus based ELISPOT assay detects high frequencies of Pol-specific CD8 T cells in HIV-1-positive individuals. AIDS 13, 767.

Lehmann, P.V., 2005. Image analysis and data management of ELISPOT assay results. Methods Mol. Biol. 302, 117.

Maecker, H.T., Hassler, J., Payne, J.K., Summers, A., Comatas, K., Ghanayem, M., Morse, M.A., Clay, T.M., Lyerly, H.K., Bhatia, S., Ghanekar, S.A., Maino, V.C., Delarosa, C., Disis, M.L., 2008. Precision and linearity targets for validation of an IFNgamma ELISPOT, cytokine flow cytometry, and tetramer assay using CMV peptides. BMC Immunol. 9, 9.

Moodie, Z., Huang, Y., Gu, L., Hural, J., Self, S.G., 2006. Statistical positivity criteria for the analysis of ELISpot assay data in HIV-1 vaccine trials. J. Immunol. Methods 315, 121.

Moodie, Z., Price, L., Gouttefangeas, C., Mander, A., Janetzki, S., Lower, M., Welters, M.J., Ottensmeier, C., van der Burg, S.H., Britten, C.M., 2010. Response definition criteria for ELISPOT assays revisited. Cancer Immunol. Immunother. 59, 1489.

Moodie, Z., Price, L., Janetzki, S., Britten, C.M., 2012. Response determination criteria for ELISPOT: toward a standard that can be applied across laboratories. Methods Mol. Biol. 792, 185.

Mwau, M., McMichael, A.J., Hanke, T., 2002. Design and validation of an enzyme-linked immunospot assay for use in clinical trials of candidate HIV vaccines. AIDS Res. Hum. Retroviruses 18, 611.

Oxford Immunotec, 2006. T-SPOT.TB. An Aid in the Diagnosis of Tuberculosis Infection. Visual procedure guide. For in vitro diagnostic use (Oxford).

Russell, N.D., Hudgens, M.G., Ha, R., Havenar-Daughton, C., McElrath, M.J., 2003. Moving to human immunodeficiency virus type 1 vaccine efficacy trials: defining T cell responses as potential correlates of immunity. J. Infect. Dis. 187, 226.

Samri, A., Durier, C., Urrutia, A., Sanchez, I., Gahery-Segard, H., Imbart, S., Sinet, M., Tartour, E., Aboulker, J.P., Autran, B., Venet, A., 2006. Evaluation of the interlaboratory concordance in quantification of human immunodeficiency virus-specific T cells with a gamma interferon enzyme-linked immunospot assay. Clin. Vaccine Immunol. 13, 684.

Schmittel, A., Keilholz, U., Thiel, E., Scheibenbogen, C., 2000. Quantification of tumor-specific T lymphocytes with the ELISPOT assay. J. Immunother. 23, 289.

Schölvinck, E., Wilkinson, K.A., Whelan, A.O., Martineau, A.R., Levin, M., Wilkinson, R.J., 2004. Gamma interferon-based immunodiagnosis of tuberculosis: comparison between whole-blood and enzyme-linked immunospot methods. J. Clin. Microbiol. 42, 829.

Slota, M., Lim, J.B., Dang, Y., Disis, M.L., 2011. ELISpot for measuring human immune responses to vaccines. Expert Rev. Vaccines 10, 299.

Stoller, D.S., 1954. Univariate two-population distribution-free discrimination. J. Am. Stat. Assoc. 49, 770.

Tukey, J.W., 1957. On the comparative anatomy of transformations. Ann. Math. Stat. 28, 602.

Vaquerano, J.E., Peng, M., Chang, J.W., Zhou, Y.M., Leong, S.P., 1998. Digital quantification of the enzyme-linked immunospot (ELISPOT). Biotechniques 25 (830-4), 836.

Versteegen, J.M., Logtenberg, T., Ballieux, R.E., 1988. Enumeration of IFN-gamma-producing human lymphocytes by spot-ELISA. A method to detect lymphokine-producing lymphocytes at the single-cell level. J. Immunol. Methods 111, 25.