
Import and process antimicrobial phenotype data from common sources
Source: R/import_pheno.R
This function imports antibiotic susceptibility testing (AST) datasets in formats exported by EBI, NCBI, WHOnet and several automated AST instruments (Vitek, Microscan, Sensititre, Phoenix). It can optionally use the AMR package to interpret susceptibility phenotypes (SIR) based on EUCAST or CLSI guidelines (human breakpoints and/or ECOFF).
Usage
import_pheno(
  input,
  format = "ebi",
  interpret_eucast = FALSE,
  interpret_clsi = FALSE,
  interpret_ecoff = FALSE,
  ...
)
Arguments
- input
A string representing a dataframe, or a path to an input file, containing the phenotype data in a supported format. These files may be downloaded from public sources such as the EBI AMR web browser, EBI FTP site, or NCBI browser, or using the functions download_ebi(), download_ncbi_pheno(), or query_ncbi_bq_geno(); or the files may be exported from supported AST instruments.
- format
A string indicating the format of the data: "ebi" (default), "ebi_web", "ebi_ftp", "ncbi", "ncbi_biosample", "vitek", "microscan", "phoenix", "sensititre", or "whonet". This determines which importer function the data is passed to for processing (see below).
- interpret_eucast
A logical value (default is FALSE). If TRUE, the function will interpret the susceptibility phenotype (SIR) for each row based on the MIC or disk diffusion values, against EUCAST human breakpoints. These will be reported in a new column pheno_eucast, of class 'sir'.
- interpret_clsi
A logical value (default is FALSE). If TRUE, the function will interpret the susceptibility phenotype (SIR) for each row based on the MIC or disk diffusion values, against CLSI human breakpoints. These will be reported in a new column pheno_clsi, of class 'sir'.
- interpret_ecoff
A logical value (default is FALSE). If TRUE, the function will interpret the wildtype vs nonwildtype status for each row based on the MIC or disk diffusion values, against epidemiological cut-off (ECOFF) values. These will be reported in a new column ecoff, of class 'sir' and coded as NWT (nonwildtype) or WT (wildtype).
- ...
Format-specific arguments. See the importer function for the chosen format:
  "ebi": import_ebi_pheno()
  "ebi_web": import_ebi_pheno()
  "ebi_ftp": import_ebi_pheno_ftp()
  "ncbi": import_ncbi_pheno()
  "ncbi_biosample": import_ncbi_biosample()
  "vitek": import_vitek_pheno()
  "microscan": import_microscan_pheno()
  "sensititre": import_sensititre_pheno()
  "phoenix": import_phoenix_pheno()
  "whonet": import_whonet_pheno()
Value
A data frame with the processed AST data, including additional columns:
- id: The sample identifier (character).
- spp_pheno: The species phenotype, formatted using the AMR::as.mo() function (class mo).
- drug: The antibiotic used in the test, formatted using the AMR::as.ab() function (class ab).
- mic: The minimum inhibitory concentration (MIC) value, formatted using the AMR::as.mic() function (class mic).
- disk: The disk diffusion measurement (in mm), formatted using the AMR::as.disk() function (class disk).
- method: The AST method (e.g., "broth dilution", "disk diffusion", "Etest", "agar dilution"). Expected values are based on the NCBI/EBI antibiogram specification (character).
- platform: The AST platform/instrument (e.g., "Vitek", "Phoenix", "Sensititre") (character).
- guideline: The AST standard recorded in the input file as being used for the AST assay (character).
- pheno_eucast: The phenotype newly interpreted against EUCAST human breakpoint standards (as S/I/R), based on the MIC or disk diffusion data (class sir).
- pheno_clsi: The phenotype newly interpreted against CLSI human breakpoint standards (as S/I/R), based on the MIC or disk diffusion data (class sir).
- ecoff: The phenotype newly interpreted against the ECOFF (as WT/NWT), based on the MIC or disk diffusion data (class sir).
- pheno_provided: The original phenotype interpretation provided in the input file, formatted using AMR::as.sir() (class sir).
- source: The source of each data point (from the publications or bioproject field in the input file, or replaced with a single value passed in as the source parameter) (character).
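One common use of the returned columns is checking how often the interpretation supplied in the input file (pheno_provided) agrees with a fresh interpretation (pheno_eucast). The following is a minimal sketch in base R, assuming a toy data frame that only mimics the documented column names. The values are made up for illustration, and the columns are plain character vectors rather than the 'sir' and 'ab' classes that import_pheno() returns, so with real output the comparison would likely need as.character() around each column.

```r
# Toy stand-in for import_pheno() output: invented values, plain character
# columns (the real function returns classes 'ab' and 'sir' for these).
pheno_toy <- data.frame(
  id = c("S1", "S2", "S3", "S4"),
  drug = c("CIP", "CIP", "MEM", "MEM"),
  pheno_provided = c("R", "S", "S", "R"),
  pheno_eucast = c("R", "I", "S", "R"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the provided calls against the re-interpreted calls.
agreement <- table(
  provided = pheno_toy$pheno_provided,
  eucast = pheno_toy$pheno_eucast
)
print(agreement)

# Keep only the rows where the two interpretations disagree.
discordant <- pheno_toy[pheno_toy$pheno_provided != pheno_toy$pheno_eucast, ]
print(discordant)
```

Rows that end up in discordant are candidates for manual review, for example because the input file was interpreted under a different guideline version.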
Examples
if (FALSE) { # \dontrun{
# import NCBI data retrieved from Google Cloud, without re-interpreting resistance
head(staph_pheno_ncbi_cloud_raw)
pheno <- import_pheno(staph_pheno_ncbi_cloud_raw, format = "ncbi")
# import NCBI data where biosample column has been renamed to 'id'
head(staph_pheno_ncbi_raw)
import_pheno(staph_pheno_ncbi_raw, "ncbi", sample_col = "id")
# import NCBI data and re-interpret resistance (S/I/R) and WT/NWT (vs ECOFF)
head(ecoli_pheno_raw)
pheno <- import_pheno(ecoli_pheno_raw,
  format = "ncbi",
  interpret_eucast = TRUE, interpret_ecoff = TRUE
)
# download Klebsiella quasipneumoniae phenotype data from NCBI BioSample
kquasi_raw_ncbi <- download_ncbi_pheno("Klebsiella quasipneumoniae")
head(kquasi_raw_ncbi)
# import the data and interpret against EUCAST breakpoints
pheno <- import_pheno(kquasi_raw_ncbi,
  format = "ncbi_biosample",
  interpret_eucast = TRUE
)
# download Klebsiella quasipneumoniae phenotype data from EBI
kquasi_raw_ebi <- download_ebi(species = "Klebsiella quasipneumoniae")
head(kquasi_raw_ebi)
# import the data and interpret against ECOFF
pheno <- import_pheno(kquasi_raw_ebi,
  format = "ebi_ftp",
  interpret_ecoff = TRUE
)
# import Vitek data from file, with default parameters
pheno <- import_pheno("vitek_export.tsv",
  format = "vitek"
)
# import Vitek data from file
# specify guideline that was used, remove dates, ignore expertized calls
pheno <- import_pheno("vitek_export.tsv",
  format = "vitek",
  instrument_guideline = "EUCAST 2025",
  use_expertized = FALSE,
  include_dates = FALSE
)
} # }
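The format argument described under Arguments decides which importer receives the data. As a rough illustration of that mapping, the sketch below stores the documented format-to-importer pairs in a named vector and looks one up by name. resolve_importer() is a hypothetical helper written for this example; it is not part of the package and says nothing about how import_pheno() is implemented internally.

```r
# Format-to-importer pairs as listed in the argument documentation.
importers <- c(
  ebi = "import_ebi_pheno",
  ebi_web = "import_ebi_pheno",
  ebi_ftp = "import_ebi_pheno_ftp",
  ncbi = "import_ncbi_pheno",
  ncbi_biosample = "import_ncbi_biosample",
  vitek = "import_vitek_pheno",
  microscan = "import_microscan_pheno",
  sensititre = "import_sensititre_pheno",
  phoenix = "import_phoenix_pheno",
  whonet = "import_whonet_pheno"
)

# Hypothetical helper: return the name of the importer for a given format,
# or fail with an error when the format is not one of the documented values.
resolve_importer <- function(format = "ebi") {
  valid <- is.character(format) && length(format) == 1 &&
    format %in% names(importers)
  if (!valid) {
    stop("Unsupported format: ", format, call. = FALSE)
  }
  unname(importers[format])
}

resolve_importer("vitek")
resolve_importer()
```

Knowing which importer a format maps to tells you which help page documents the extra arguments that can be passed through `...`.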