
Assess concordance between tables of observed and predicted phenotypes
Source:R/concordance.R
concordance_from_tables.RdCalculates concordance from two data frames in standard phenotype table format, one with observed phenotype data and the other with predicted phenotype calls (e.g. output from AMRrules).
Usage
concordance_from_tables(
pheno_table,
pheno_pred_table,
true_SIR_col = "pheno_eucast",
true_ecoff_col = "ecoff",
pred_SIR = "clinical category",
pred_ecoff = "phenotype",
sample_col = "id",
drug_col = "drug",
measure_col = "mic"
)Arguments
- pheno_table
A tibble or data frame containing real (observed) phenotype data, in the format output by
import_pheno().- pheno_pred_table
A tibble or data frame containing predicted phenotype calls, e.g. AMRrules predictions imported in AMRgen using
import_amrrules_predictions().- true_SIR_col
Character. Name of the column containing S/I/R calls interpreted from real data against clinical breakpoints (default
"pheno_eucast").- true_ecoff_col
Character. Name of the column containing WT/NWT calls interpreted from real data against ECOFF (default
"ecoff").- pred_SIR
Character. Name of the column containing S/I/R calls predicted from genotypes (default
"clinical category").- pred_ecoff
Character. Name of the column containing WT/NWT calls predicted from genotypes (default
"phenotype").- sample_col
Character. Name of the column containing sample identifiers. This must be the same in both tables (default
"id").- drug_col
Character. Name of the column containing drug agent identifiers. This must be the same in both tables (default
"drug").- measure_col
Character. Name of the column containing observed MIC or disk measurements for plotting. Valid options are
"mic"or"disk"(default"mic").
Value
A named list with the following elements:
obs_pred: A copy of thepheno_tablewith the predictions frompheno_pred_tablemerged in.metrics: A tibble listing concordance metrics, comparing observed vs predicted R and/or NWT calls, for all drugs with matched samples in both input tables.drugs_R: A long-form tibble listing the number of samples with each combination of observed and predicted binary calls for R, for all drugs with matched samples in both input tables.drugs_NWT: A long-form tibble listing the number of samples with each combination of observed and predicted binary calls for NWT, for all drugs with matched samples in both input tables.plot_R: A list containing two-panel plots summarising observed vs predicted clinical categories, one for each drug.plot_NWT: A list containing two-panel plots summarising observed vs predicted WT/NWT categories, one for each drug.plot_R_dist: A list containing plots of the assay value distributions (MIC or disk zones), coloured by S/I/R prediction, one for each drug.plot_NWT_dist: A list containing plots of the assay value distributions (MIC or disk zones), coloured by WT/NWT prediction, one for each drug.