Establishing whole genome sequencing at the core of epidemiological surveillance

Over the last two decades, genome sequencing has become an important tool for understanding and tracking the spread of pathogens. Genomic epidemiology is now a preferred method of surveillance, and recent years have seen pathogen sequencing at an unprecedented scale, pushing the underlying technologies to the limit. This has brought major innovations and opportunities to public attention and has identified new research areas. However, major challenges remain in public health settings. These include: incorporating new sequencing technologies and data types for real-time surveillance; developing platforms and nomenclatures for genome-based typing and epidemiology; understanding pathogen evolution and the emergence of virulence and antimicrobial resistance; and contextualizing knowledge of clinical microbiology with One Health ecological genomics. In this collection, we bring together recent studies that are establishing pathogen genomics as a major part of contemporary disease control efforts.
Collection Contents
Pangenomic analyses of antibiotic-resistant Campylobacter jejuni reveal unique lineage distributions and epidemiological associations
Application of whole-genome sequencing (WGS) to characterize foodborne pathogens has advanced our understanding of circulating genotypes and evolutionary relationships. Herein, we used WGS to investigate the genomic epidemiology of Campylobacter jejuni, a leading cause of foodborne disease. Among the 214 strains recovered from patients with gastroenteritis in Michigan, USA, 85 multilocus sequence types (STs) were represented and 135 (63.1 %) were phenotypically resistant to at least one antibiotic. Horizontally acquired antibiotic resistance genes were detected in 128 (59.8 %) strains and the genotypic resistance profiles were mostly consistent with the phenotypes. Core-gene phylogenetic reconstruction identified three sequence clusters that varied in frequency, while a neighbour-net tree detected significant recombination among the genotypes (pairwise homoplasy index P<0.01). Epidemiological analyses revealed that travel was a significant contributor to pangenomic and ST diversity of C. jejuni, while some lineages were unique to rural counties and more commonly possessed clinically important resistance determinants. Variation was also observed in the frequency of lineages over the 4 year period with chicken and cattle specialists predominating. Altogether, these findings highlight the importance of geographically specific factors, recombination and horizontal gene transfer in shaping the population structure of C. jejuni. They also illustrate the usefulness of WGS data for predicting antibiotic susceptibilities and surveillance, which are important for guiding treatment and prevention strategies.
Enteric fever cluster identification in South Africa using genomic surveillance of Salmonella enterica serovar Typhi
The National Institute for Communicable Diseases in South Africa participates in national laboratory-based surveillance for human isolates of Salmonella species. Laboratory analysis includes whole-genome sequencing (WGS) of isolates. We report on WGS-based surveillance of Salmonella enterica serovar Typhi (Salmonella Typhi) in South Africa from 2020 through 2021. We describe how WGS analysis identified clusters of enteric fever in the Western Cape Province of South Africa and describe the epidemiological investigations associated with these clusters. A total of 206 Salmonella Typhi isolates were received for analysis. Genomic DNA was isolated from bacteria and WGS was performed using Illumina NextSeq technology. WGS data were investigated using multiple bioinformatics tools, including those available at the Centre for Genomic Epidemiology, EnteroBase and Pathogenwatch. Core-genome multilocus sequence typing was used to investigate the phylogeny of isolates and identify clusters. Three major clusters of enteric fever were identified in the Western Cape Province: cluster one (n=11 isolates), cluster two (n=13 isolates) and cluster three (n=14 isolates). To date, no likely source has been identified for any of the clusters. All isolates associated with the clusters showed the same genotype (4.3.1.1.EA1) and resistome (antimicrobial resistance genes: blaTEM-1B, catA1, sul1, sul2, dfrA7). The implementation of genomic surveillance of Salmonella Typhi in South Africa has enabled rapid detection of clusters indicative of possible outbreaks. Cluster identification allows for targeted epidemiological investigations and a timely, coordinated public health response.
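To illustrate how core-genome multilocus sequence typing can be used to flag clusters, the following is a minimal sketch of single-linkage clustering on allelic distances. It is not the pipeline used in the study: the function names, the toy allele profiles and the threshold of one allele difference are all assumptions chosen for illustration, and real schemes cover thousands of loci with thresholds set per organism.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci with different allele numbers; loci missing (None) in either profile are skipped."""
    return sum(1 for x, y in zip(a, b) if x is not None and y is not None and x != y)


def single_linkage_clusters(profiles, threshold):
    """Group isolates linked, directly or through a chain, by pairs within `threshold` allele differences."""
    parent = {name: name for name in profiles}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    # Singletons are not clusters; report only groups of two or more isolates.
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)


# Hypothetical five-locus profiles, for illustration only.
profiles = {
    "iso1": [1, 4, 2, 7, 3],
    "iso2": [1, 4, 2, 7, 5],
    "iso3": [1, 4, 9, 7, 5],
    "iso4": [8, 6, 1, 2, 9],
}
clusters = single_linkage_clusters(profiles, threshold=1)
print(clusters)
```

Single linkage lets a cluster grow through chains of closely related isolates, which is why iso1 and iso3 end up together here even though they differ at two loci.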
Genomic analysis of the initial dissemination of carbapenem-resistant Klebsiella pneumoniae clones in a tertiary hospital
Carbapenem-resistant Klebsiella pneumoniae is a major cause of hospital-acquired infections and the fastest-growing pathogen in Europe. Carbapenem resistance was detected at the Consorcio Hospital General Universitario de Valencia (CHGUV) in early 2015, and there has been a significant increase in carbapenem-resistant isolates since then. In this study, we collected carbapenem-resistant isolates from this hospital during the period of increase (from 2015 to 2019) and studied how K. pneumoniae carbapenem-resistant isolates emerged and spread in the hospital. A total of 225 isolates were subjected to whole-genome sequencing with Illumina NextSeq. We characterized the isolates by identifying lineages and antimicrobial resistance genes and plasmids, especially those related to reduced carbapenem susceptibility. Our findings show that the initial carbapenem resistance emergence and dissemination at the CHGUV occurred during a short period of 1 year. Furthermore, it was complex, involving six different lineages of types ST307, ST11, ST101 and ST437, different resistance-determinant factors, including OXA-48, NDM-1, NDM-23 and DHA-1, and different plasmids.
Whole-genome sequencing of Shigella for surveillance purposes shows (inter)national relatedness and multidrug resistance in isolates from men who have sex with men
In the Netherlands, more than half of domestic shigellosis cases are among men who have sex with men (MSM), particularly in the Amsterdam region. However, there is limited insight into which Shigella strains circulate in the Netherlands. Our objective was to assess the added value of whole-genome sequencing (WGS)-based surveillance for Shigella. To this end, we determined the relatedness among Shigella spp. isolates from patients in the Amsterdam region, as well as in an international context, including antimicrobial resistance markers, using WGS. The following criteria were used: it should provide insight into (1) clustering of shigellosis cases and the affected population, (2) the extent of admixture of MSM-associated isolates with those from the broader population and (3) the presence of antimicrobial resistance. It should then lead to more opportunities for targeted control measures. For this study, Shigella isolates from three laboratories in the Amsterdam region obtained between February 2019 and October 2021 were subjected to Illumina WGS at the National Institute for Public Health and the Environment (RIVM). Raw data were quality-checked and assembled, the Shigella serotype was determined with ShigaTyper, and antimicrobial resistance markers were detected using ResFinder and PointFinder. For Shigella sonnei, subclades were determined using Mykrobe. Relatedness of isolates, including 21 international reference genomes, was assessed with core genome multilocus sequence typing. In total, 109 isolates were included, of which 27 were from females (25 %) and 66 were from males (61 %), with the majority of the latter (n=48, 73 %) being from MSM. No information on sex was available for the remaining 16 cases. The WGS data for all isolates, comprising 55 S. sonnei, 52 Shigella flexneri, 1 Shigella boydii and 1 Shigella dysenteriae, met the quality criteria.
In total, 14 clusters containing 51 isolates (49 %) were identified, with a median cluster size of 2.5 cases (range: 2–15). Nine out of 14 clusters were MSM-associated, and 8 clusters (57 %) were travel-related. Six of the MSM clusters were related to international reference genomes. The prevalence of antimicrobial resistance markers was higher among isolates from MSM than non-MSM patients, particularly for ciprofloxacin (89 vs 33 %) and azithromycin (58 vs 17 %). In conclusion, about half of Shigella spp. patients were part of a cluster, of which a substantial part were related to international reference genomes, particularly among MSM, and a high prevalence of antimicrobial resistance markers was found. These findings indicate widespread international circulation of Shigella spp., particularly among MSM, with multidrug resistance that hampers treatment of patients. Moreover, the results of this study led to the implementation of a national WGS-based laboratory surveillance programme for Shigella spp. that started in April 2022.
Phylogenomic investigation of an outbreak of fluoroquinolone-resistant Salmonella enterica subsp. enterica serovar Paratyphi A in Phnom Penh, Cambodia
In early 2020, the Medical Biology Laboratory of the Pasteur Institute of Cambodia isolated an unusually high number of fluoroquinolone-resistant Salmonella enterica subspecies enterica serovar Paratyphi A strains during its routine bacteriological surveillance activities in Phnom Penh, Cambodia. A public-health investigation was supported by genome sequencing of these Paratyphi A strains to gain insights into the genetic diversity and population structure of a potential outbreak of fluoroquinolone-resistant paratyphoid fever. Comparative genomic and phylodynamic analyses revealed the 2020 strains were descended from a previously described 2013–2015 outbreak of Paratyphi A infections. Our analysis showed sub-lineage 2.3.1 had remained largely susceptible to fluoroquinolone drugs until 2015, but acquired chromosomal resistance to these drugs during six separate events between late 2012 and 2015. The emergence of fluoroquinolone resistance was rapidly followed by the replacement of the original susceptible Paratyphi A population, which led to a dramatic increase of fluoroquinolone-resistant blood-culture-confirmed cases in subsequent years (2016–2020). The rapid acquisition of resistance-conferring mutations in the Paratyphi A population over a 3 year period is suggestive of a strong selective pressure on that population, likely linked with fluoroquinolone use. In turn, emergence of fluoroquinolone resistance has led to increased use of extended-spectrum cephalosporins like ceftriaxone that are becoming the drug of choice for empirical treatment of paratyphoid fever in Cambodia.
Harmonization of whole-genome sequencing for outbreak surveillance of Enterobacteriaceae and Enterococci
Whole-genome sequencing (WGS) is becoming the de facto standard for bacterial typing and outbreak surveillance of resistant bacterial pathogens. However, interoperability for WGS of bacterial outbreaks is poorly understood. We hypothesized that harmonization of WGS for outbreak surveillance is achievable through the use of identical protocols for both data generation and data analysis. A set of 30 bacterial isolates, comprising various species belonging to the family Enterobacteriaceae and the genus Enterococcus, were selected and sequenced using the same protocol on the Illumina MiSeq platform in each individual centre. All generated sequencing data were analysed by one centre using BioNumerics (6.7.3) for (i) genotyping of origins of replication and antimicrobial resistance genes and (ii) core-genome multi-locus sequence typing (cgMLST) for Escherichia coli and Klebsiella pneumoniae and whole-genome multi-locus sequence typing (wgMLST) for all species. Additionally, a split k-mer analysis was performed to determine the number of SNPs between samples. A precision of 99.0 % and an accuracy of 99.2 % were achieved for genotyping. Based on cgMLST, a discrepant allele was called only in 2/27 and 3/15 comparisons between two genomes, for E. coli and K. pneumoniae, respectively. Based on wgMLST, the number of discrepant alleles ranged from 0 to 7 (average 1.6). For SNPs, this ranged from 0 to 11 SNPs (average 3.4). Furthermore, we demonstrate that using different de novo assemblers to analyse the same dataset introduces up to 150 SNPs, which surpasses most thresholds for bacterial outbreaks. This shows the importance of harmonizing data processing for surveillance of bacterial outbreaks. In summary, multi-centre WGS for bacterial surveillance is achievable, but only if protocols are harmonized.
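The abstract reports precision and accuracy for genotyping without spelling out how they were computed. As a hedged sketch, the snippet below uses the standard confusion-matrix definitions, which is an assumption about the authors' method. The counts are hypothetical and are not taken from the study.

```python
def precision(tp, fp):
    """Fraction of positive calls (e.g. gene reported present) that are correct."""
    return tp / (tp + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all calls, positive and negative, that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts of gene presence/absence calls compared with a reference result.
tp, fp, tn, fn = 90, 10, 880, 20

prec = precision(tp, fp)
acc = accuracy(tp, tn, fp, fn)
print(f"precision = {prec:.3f}, accuracy = {acc:.3f}")
```

Precision ignores true negatives, so in gene-detection settings where most genes are absent from most isolates it can be noticeably lower than accuracy for the same set of calls.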
Application of a strain-level shotgun metagenomics approach on food samples: resolution of the source of a Salmonella food-borne outbreak
Food-borne outbreak investigation currently relies on the time-consuming and challenging bacterial isolation from food, to be able to link food-derived strains to more easily obtained isolates from infected people. When no food isolate can be obtained, the source of the outbreak cannot be unambiguously determined. Shotgun metagenomics approaches applied to the food samples could circumvent this need for isolation from the suspected source, but require downstream strain-level data analysis to be able to accurately link to the human isolate. Until now, this approach has not yet been applied outside research settings to analyse real food-borne outbreak samples. In September 2019, a Salmonella outbreak occurred in a hotel school in Bruges, Belgium, affecting over 200 students and teachers. Following standard procedures, the Belgian National Reference Center for human salmonellosis and the National Reference Laboratory for Salmonella in food and feed used conventional analysis based on isolation, serotyping and MLVA (multilocus variable number tandem repeat analysis) comparison, followed by whole-genome sequencing, to confirm the source of the contamination over 2 weeks after receipt of the sample, which was freshly prepared tartar sauce in a meal cooked at the school. Our team used this outbreak as a case study to deliver a proof of concept for a short-read strain-level shotgun metagenomics approach for source tracking. We received two suspect food samples: the full meal and some freshly made tartar sauce served with this meal, requiring the use of raw eggs. After analysis, we could prove, without isolation, that Salmonella was present in both samples, and we obtained an inferred genome of a Salmonella enterica subsp. enterica serovar Enteritidis that could be linked back to the human isolates of the outbreak in a phylogenetic tree. 
These metagenomics-derived outbreak strains were separated from sporadic cases as well as from another outbreak circulating in Europe during the same period. This is, to our knowledge, the first Salmonella food-borne outbreak investigation to link the food source using a metagenomics approach alone, and to do so within a short time frame.
Rapid nanopore-based DNA sequencing protocol of antibiotic-resistant bacteria for use in surveillance and outbreak investigation
Outbreak investigations are essential to control and prevent the dissemination of pathogens. This study developed and validated a complete analysis protocol for faster and more accurate surveillance and outbreak investigations of antibiotic-resistant microbes based on Oxford Nanopore Technologies (ONT) DNA whole-genome sequencing. The protocol was developed using 42 methicillin-resistant Staphylococcus aureus (MRSA) isolates identified from former well-characterized outbreaks. The validation of the protocol was performed using Illumina technology (MiSeq, Illumina). Additionally, a real-time outbreak investigation of six clinical S. aureus isolates was conducted to test the ONT-based protocol. The suggested protocol includes: (1) a 20 h sequencing run; (2) identification of the sequence type (ST); (3) de novo genome assembly; (4) polishing of the draft genomes; and (5) phylogenetic analysis based on SNPs. After the sequencing run, it was possible to identify the ST in 2 h (20 min per isolate). Assemblies were achieved after 4 h (40 min per isolate) while the polishing was carried out in 7 min per isolate (42 min in total). The phylogenetic analysis took 0.6 h to confirm an outbreak. Overall, the developed protocol was able to at least discard an outbreak in 27 h (mean) after the bacterial identification and less than 33 h to confirm it. All these estimated times were calculated considering the average time for six MRSA isolates per sequencing run. During the real-time S. aureus outbreak investigation, the protocol was able to identify two outbreaks in less than 31 h. The suggested protocol enables identification of outbreaks in early stages using a portable and low-cost device along with a streamlined downstream analysis, therefore having the potential to be incorporated in routine surveillance analysis workflows. In addition, further analysis may include identification of virulence and antibiotic resistance genes for improved pathogen characterization.
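The step durations quoted above can be tallied to see how the per-isolate times relate to the per-run totals. This is a simple arithmetic sketch using only the figures given in the abstract (six isolates per run); the variable names are illustrative, and the abstract does not state exactly which steps make up its 27 h and 33 h figures, so the total below is only the sum of the listed steps.

```python
ISOLATES_PER_RUN = 6

# Durations taken from the abstract: one run-level step in hours,
# three per-isolate steps in minutes, and the phylogenetic analysis in hours.
sequencing_run_h = 20.0
per_isolate_min = {"sequence typing": 20, "assembly": 40, "polishing": 7}
phylogeny_h = 0.6

step_hours = {"sequencing run": sequencing_run_h}
for step, minutes in per_isolate_min.items():
    # Scale per-isolate minutes to hours for a full run of six isolates.
    step_hours[step] = minutes * ISOLATES_PER_RUN / 60
step_hours["phylogenetic analysis"] = phylogeny_h

total_hours = sum(step_hours.values())

for step, hours in step_hours.items():
    print(f"{step:>22}: {hours:5.1f} h")
print(f"{'total':>22}: {total_hours:5.1f} h")
```

The per-isolate figures scale to the stated run totals (2 h for sequence typing, 4 h for assembly, 42 min for polishing), and the listed steps sum to a little over 27 h.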
Genomic surveillance, characterization and intervention of a polymicrobial multidrug-resistant outbreak in critical care
Background. Infections caused by carbapenem-resistant Acinetobacter baumannii (CR-Ab) have become increasingly prevalent in clinical settings and often result in significant morbidity and mortality due to their multidrug resistance (MDR). Here we present an integrated whole-genome sequencing (WGS) response to a persistent CR-Ab outbreak in a Brisbane hospital between 2016 and 2018.
Methods. A. baumannii, Klebsiella pneumoniae, Serratia marcescens and Pseudomonas aeruginosa isolates were sequenced using the Illumina platform primarily to establish isolate relationships based on core-genome SNPs, MLST and antimicrobial resistance gene profiles. Representative isolates were selected for PacBio sequencing. Environmental metagenomic sequencing with Illumina was used to detect persistence of the outbreak strain in the hospital.
Results. In response to a suspected polymicrobial outbreak from May to August 2016, 28 CR-Ab (and 21 other MDR Gram-negative bacilli) were collected from Intensive Care Unit and Burns Unit patients and sent for WGS with a 7 day turn-around time in clinical reporting. All CR-Ab were sequence type (ST)1050 (Pasteur ST2) and within 10 SNPs of each other, indicative of an ongoing outbreak, and distinct from historical CR-Ab isolates from the same hospital. Possible transmission routes between patients were identified on the basis of CR-Ab and K. pneumoniae SNP profiles. Continued WGS surveillance from 2016 to 2018 enabled suspected outbreak cases to be refuted, but a resurgence of the outbreak CR-Ab in mid-2018 in the Burns Unit prompted additional screening. Environmental metagenomic sequencing identified the hospital plumbing as a potential source. Replacement of the plumbing and routine drain maintenance resulted in rapid resolution of the secondary outbreak and significant risk reduction, with no discernible transmission in the Burns Unit since.
Conclusion. We implemented a comprehensive WGS and metagenomics investigation that resolved a persistent CR-Ab outbreak in a critical care setting.
Neisseria gonorrhoeae clustering to reveal major European whole-genome-sequencing-based genogroups in association with antimicrobial resistance
Neisseria gonorrhoeae, the bacterium responsible for the sexually transmitted disease gonorrhoea, has shown an extraordinary ability to develop antimicrobial resistance (AMR) to multiple classes of antimicrobials. With no available vaccine, managing N. gonorrhoeae infections demands effective preventive measures, antibiotic treatment and epidemiological surveillance. The latter two are progressively being supported by the generation of whole-genome sequencing (WGS) data on behalf of national and international surveillance programmes. In this context, this study aims to perform N. gonorrhoeae clustering into genogroups based on WGS data, for enhanced prospective laboratory surveillance. Particularly, it aims to identify the major circulating WGS-genogroups in Europe and to establish a relationship between these and AMR. Ultimately, it enriches public databases by contributing with WGS data from Portuguese isolates spanning 15 years of surveillance. A total of 3791 carefully inspected N. gonorrhoeae genomes from isolates collected across Europe were analysed using a gene-by-gene approach (i.e. using cgMLST). Analysis of cluster composition and stability allowed the classification of isolates into a two-step hierarchical genogroup level determined by two allelic distance thresholds revealing cluster stability. Genogroup clustering in general agreed with available N. gonorrhoeae typing methods [i.e. MLST (multilocus sequence typing), NG-MAST (N. gonorrhoeae multi-antigen sequence typing) and PubMLST core-genome groups], highlighting the predominant genogroups circulating in Europe, and revealed that the vast majority of the genogroups present a dominant AMR profile. Additionally, a non-static gene-by-gene approach combined with a more discriminatory threshold for potential epidemiological linkage enabled us to match data with previous reports on outbreaks or transmission chains. In conclusion, this genogroup assignment allows a comprehensive analysis of N. gonorrhoeae genetic diversity and the identification of the WGS-based genogroups circulating in Europe, while facilitating the assessment (and continuous monitoring) of their frequency, geographical dispersion and potential association with specific AMR signatures. This strategy may benefit public-health actions through the prioritization of genogroups to be controlled, the identification of emerging resistance carriage, and the potential facilitation of data sharing and communication.
Genomic surveillance of Escherichia coli and Klebsiella spp. in hospital sink drains and patients
Bede Constantinides, Kevin K. Chau, T. Phuong Quan, Gillian Rodger, Monique I. Andersson, Katie Jeffery, Sam Lipworth, Hyun S. Gweon, Andy Peniket, Graham Pike, Julian Millo, Mary Byukusenge, Matt Holdaway, Cat Gibbons, Amy J. Mathers, Derrick W. Crook, Timothy E.A. Peto, A. Sarah Walker and Nicole Stoesser
Escherichia coli and Klebsiella spp. are important human pathogens that cause a wide spectrum of clinical disease. In healthcare settings, sinks and other wastewater sites have been shown to be reservoirs of antimicrobial-resistant E. coli and Klebsiella spp., particularly in the context of outbreaks of resistant strains amongst patients. Without focusing exclusively on resistance markers or a clinical outbreak, we demonstrate that many hospital sink drains are abundantly and persistently colonized with diverse populations of E. coli, Klebsiella pneumoniae and Klebsiella oxytoca, including both antimicrobial-resistant and susceptible strains. Using whole-genome sequencing of 439 isolates, we show that environmental bacterial populations are largely structured by ward and sink, with only a handful of lineages, such as E. coli ST635, being widely distributed, suggesting different prevailing ecologies, which may vary as a result of different inputs and selection pressures. Whole-genome sequencing of 46 contemporaneous patient isolates identified one (2 %; 95 % CI 0.05–11 %) E. coli urine infection-associated isolate with high similarity to a prior sink isolate, suggesting that sinks may contribute to up to 10 % of infections caused by these organisms in patients on the ward over the same timeframe. Using metagenomics from 20 sink-timepoints, we show that sinks also harbour many clinically relevant antimicrobial resistance genes including blaCTX-M, blaSHV and mcr, and may act as niches for the exchange and amplification of these genes.
Our study reinforces the potential role of sinks in contributing to Enterobacterales infection and antimicrobial resistance in hospital patients, something that could be amenable to intervention. This article contains data hosted by Microreact.
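The estimate of one sink-linked isolate out of 46 (2 %; 95 % CI 0.05–11 %) is a binomial proportion with a confidence interval. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) interval and solves for its bounds by bisection using only the standard library. The function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) == target, for f monotonically decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    # Lower bound: P(X >= x | p) = alpha/2, i.e. cdf(x - 1; p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else _solve_decreasing(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else _solve_decreasing(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(1, 46)
print(f"point estimate = {1 / 46:.1%}, 95% CI = {lower:.2%} to {upper:.1%}")
```

Under this assumption the bounds land close to the reported interval. With a single observed event the interval is strongly asymmetric, which is why the upper bound is several times the point estimate.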
Use of whole genome sequencing in surveillance for antimicrobial-resistant Shigella sonnei infections acquired from domestic and international sources
Shigella species are a major cause of gastroenteritis worldwide, and Shigella sonnei is the most common species isolated within the United States. Previous surveillance work in Pennsylvania documented increased antimicrobial resistance (AMR) in S. sonnei associated with reported illnesses. The present study examined a subset of these isolates by whole genome sequencing (WGS) to determine the relationship between domestic and international isolates, to identify genes that may be useful for identifying specific Global Lineages of S. sonnei and to test the accuracy of WGS for predicting AMR phenotype. A collection of 22 antimicrobial-resistant isolates from patients infected within the United States or while travelling internationally between 2009 and 2014 was chosen for WGS. Phylogenetic analysis revealed that both international and domestic isolates belonged to one of two previously defined Global Lineages of S. sonnei, designated Lineage II and Lineage III. Twelve of 17 alleles tested distinguish these two lineages. Lastly, genome analysis was used to identify AMR determinants. Genotypic analysis was concordant with phenotypic resistance for six of eight antibiotic classes. For aminoglycosides and trimethoprim, resistance genes were identified in two and three phenotypically sensitive isolates, respectively. This article contains data hosted by Microreact.
Genomic survey of Clostridium difficile reservoirs in the East of England implicates environmental contamination of wastewater treatment plants by clinical lineages
There is growing evidence that patients with Clostridium difficile-associated diarrhoea often acquire their infecting strain before hospital admission. Wastewater is known to be a potential source of surface water that is contaminated with C. difficile spores. Here, we describe a study that used genome sequencing to compare C. difficile isolated from multiple wastewater treatment plants across the East of England and from patients with clinical disease at a major hospital in the same region. We confirmed that C. difficile from 65 patients were highly diverse and that most cases were not linked to other active cases in the hospital. In total, 186 C. difficile isolates were recovered from effluent water obtained from 18 municipal treatment plants at the point of release into the environment. Whole genome comparisons of clinical and environmental isolates demonstrated highly related populations, and confirmed extensive release of toxigenic C. difficile into surface waters. An analysis based on multilocus sequence types (STs) identified 19 distinct STs in the clinical collection and 38 STs in the wastewater collection, with 13 of 44 STs common to both clinical and wastewater collections. Furthermore, we identified five pairs of highly similar isolates (≤2 SNPs different in the core genome) in clinical and wastewater collections. Strategies to control community acquisition should consider the need for bacterial control of treated wastewater.
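The ST counts above are related by simple set arithmetic: the number of distinct STs across both collections is the sum of the two counts minus the number they share. The sketch below checks that the reported figures are mutually consistent and shows the same calculation on sets. The ST labels in the second half are hypothetical and are not the study's data.

```python
def union_size(n_a, n_b, n_shared):
    """Inclusion-exclusion: number of distinct items across two collections."""
    return n_a + n_b - n_shared


# Counts reported in the abstract: 19 clinical STs, 38 wastewater STs, 13 shared.
total_sts = union_size(19, 38, 13)
print(f"distinct STs overall: {total_sts}")

# The same relationship on explicit sets (hypothetical ST labels, for illustration only).
clinical = {"ST1", "ST2", "ST8"}
wastewater = {"ST2", "ST8", "ST11", "ST35"}
shared = clinical & wastewater
print(f"shared: {sorted(shared)}, distinct overall: {len(clinical | wastewater)}")
```

The reported counts agree with one another: 19 + 38 − 13 gives the 44 distinct STs cited as the denominator.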