Skip to contents

This function imports and processes genotyping results from Kleborate (https://github.com/klebgenomics/Kleborate), extracting antimicrobial resistance determinants and mapping them to standardised drug classes.

Usage

import_kleborate(
  input_table,
  sample_col = "strain",
  kleborate_class_table = kleborate_classes,
  hgvs = TRUE
)

Arguments

input_table

A character string specifying a dataframe or path to the Kleborate results table (TSV format).

sample_col

A character string specifying the column that identifies samples in the dataset (default strain).

kleborate_class_table

A tibble containing a reference table mapping Kleborate drug class column names (Kleborate_Class) to standardised drug classes (drug_class). Defaults to kleborate_classes, which is provided internally.

hgvs

Logical indicating whether to expect mutations in HGVS format (used in Kleborate releases since v3.1.3). Default TRUE, which expects mutations formatted as e.g. "GyrA:p.S83F". Set to FALSE if your results were generated using older versions where mutations were formatted as e.g. "GyrA_83F".

Value

A tibble containing the processed AMR determinants and drug classes that is AMRgen compatible.

Details

The function performs the following steps:

  • Reads the Kleborate output table.

  • Transforms Kleborate output into long form (i.e., one AMR determinant per row).

  • Maps Kleborate drug classes to standardised drug class names. This processing ensures compatibility with downstream AMRgen analysis workflows.

Examples

# example Kleborate data from EUSCAPE project
kleborate_raw
#> # A tibble: 1,689 × 122
#>    strain    species species_match contig_count    N50 largest_contig total_size
#>    <chr>     <chr>   <chr>                <dbl>  <dbl>          <dbl>      <dbl>
#>  1 SAMEA349… Klebsi… strong                 107 270158         668396    5550237
#>  2 SAMEA349… Klebsi… strong                  72 367024        1153387    5328304
#>  3 SAMEA349… Klebsi… strong                 154 290037         723108    5865430
#>  4 SAMEA349… Klebsi… strong                 113 270158         668396    5551355
#>  5 SAMEA349… Klebsi… strong                 100 273060        1309748    5509346
#>  6 SAMEA349… Klebsi… strong                 181 203669         523738    5706273
#>  7 SAMEA349… Klebsi… strong                 193 203666         523738    5602637
#>  8 SAMEA349… Klebsi… strong                  53 383004         704598    5207015
#>  9 SAMEA349… Klebsi… strong                 151 206496         619259    5474980
#> 10 SAMEA349… Klebsi… strong                  84 371555         984007    5600579
#> # ℹ 1,679 more rows
#> # ℹ 115 more variables: GC_content <dbl>, ambiguous_bases <chr>,
#> #   QC_warnings <chr>, ST <chr>, gapA <dbl>, infB <dbl>, mdh <dbl>, pgi <dbl>,
#> #   phoE <dbl>, rpoB <dbl>, tonB <dbl>, YbST <chr>, Yersiniabactin <chr>,
#> #   ybtS <chr>, ybtX <chr>, ybtQ <chr>, ybtP <chr>, ybtA <chr>, irp2 <chr>,
#> #   irp1 <chr>, ybtU <chr>, ybtT <chr>, ybtE <chr>, fyuA <chr>,
#> #   spurious_ybt_hits <chr>, CbST <chr>, Colibactin <chr>, clbA <chr>, …

# import first few rows of this data frame and parse it as AMRfp data
kleborate_geno <- import_kleborate(kleborate_raw %>% head(n = 10), "strain")