BACKGROUND: Numerous studies of interferon-gamma release assays (IGRAs) and tuberculin skin testing (TST) to assess latent tuberculosis infection have been published without a framework to understand the extent to which these two tests should agree. Analyzing the causes of variability in agreement levels is crucial.

METHODS: A mathematical model of agreement between dichotomous tests was used to understand variations in the level of agreement between IGRA and TST results. The effect of cut-off point selection on agreement was also explored using the model. Model-based predictions are illustrated using published literature.

RESULTS: Analyses of IGRAs and TST that depart from model predictions are an indication that surrogates of prevalence of Mycobacterium tuberculosis infection may have been improperly measured or analyzed. For fixed prevalence, the extent of agreement between tests depends upon cut-off point selection. Changing cut-off points while holding prevalence constant may lead to increasing, decreasing or even no change in agreement.

CONCLUSIONS: Researchers have recognized that experimental error, clinical risk and prevalence of non-tuberculous mycobacteria contribute to study-to-study variability. In the present study, we show that paradoxical findings in certain IGRA studies can be explained by the proposed mathematical model. Re-analysis of existing studies may lead to overlooked hypotheses. Future IGRA studies will require epidemiologically well-characterized populations.
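
ILLUSTRATIVE SKETCH: The abstract does not state the model's equations. The Python sketch below is therefore not the study's model. It only shows how agreement between two dichotomous tests can be computed from prevalence, sensitivity and specificity. It assumes, for illustration, that the two tests err independently given true infection status. The function names and all numeric inputs are hypothetical and are not taken from any published data. A change of cut-off point is represented here only indirectly, as a trade of sensitivity against specificity for one test.

```python
import math


def joint_probs(prev, se1, sp1, se2, sp2):
    """Return (P(both positive), P(both negative)).

    Assumption (not from the source): the two tests are conditionally
    independent given true infection status.
    """
    both_pos = prev * se1 * se2 + (1 - prev) * (1 - sp1) * (1 - sp2)
    both_neg = prev * (1 - se1) * (1 - se2) + (1 - prev) * sp1 * sp2
    return both_pos, both_neg


def observed_agreement(prev, se1, sp1, se2, sp2):
    """Proportion of subjects on whom the two tests give the same result."""
    both_pos, both_neg = joint_probs(prev, se1, sp1, se2, sp2)
    return both_pos + both_neg


def cohen_kappa(prev, se1, sp1, se2, sp2):
    """Chance-corrected agreement (Cohen's kappa) under the same assumption."""
    po = observed_agreement(prev, se1, sp1, se2, sp2)
    # Marginal probability that each test is positive.
    pos1 = prev * se1 + (1 - prev) * (1 - sp1)
    pos2 = prev * se2 + (1 - prev) * (1 - sp2)
    # Agreement expected by chance from the marginals alone.
    pe = pos1 * pos2 + (1 - pos1) * (1 - pos2)
    if math.isclose(pe, 1.0):
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (po - pe) / (1 - pe)


if __name__ == "__main__":
    # Hypothetical test characteristics, chosen only for illustration.
    for prev in (0.05, 0.20, 0.50):
        po = observed_agreement(prev, 0.80, 0.95, 0.75, 0.85)
        k = cohen_kappa(prev, 0.80, 0.95, 0.75, 0.85)
        print(f"prevalence={prev:.2f}  agreement={po:.3f}  kappa={k:.3f}")
```

Running the loop with different sensitivity and specificity pairs for one test, while holding prevalence fixed, gives a rough feel for how a cut-off change can move agreement in either direction under these assumptions.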