
GeneMates: an R package for detecting horizontal gene co-transfer between bacteria using gene-gene associations controlled for population structure

Published version



Wick, Ryan R. 
Zobel, Justin 
Ingle, Danielle J. 
Inouye, Michael 


Abstract: Background: Horizontal gene transfer contributes to bacterial evolution by mobilising genes across various taxonomic boundaries. It is frequently mediated by mobile genetic elements (MGEs), which may capture, maintain, and rearrange mobile genes and co-mobilise them between bacteria, causing horizontal gene co-transfer (HGcoT). This physical linkage between mobile genes poses a great threat to public health, as it facilitates the dissemination and co-selection of clinically important genes amongst bacteria. Although the rapid accumulation of bacterial whole-genome sequencing (WGS) data since the 2000s enables the study of HGcoT at the population level, results based on genetic co-occurrence counts and simple association tests are usually confounded by bacterial population structure when the sampled bacteria belong to the same species, leading to spurious conclusions.

Results: We have developed a network approach to explore WGS data for evidence of intraspecies HGcoT and have implemented it in the R package GeneMates. The package takes as input an allelic presence-absence matrix of genes of interest and a matrix of core-genome single-nucleotide polymorphisms, performs association tests with linear mixed models controlled for population structure, produces a network of significantly associated alleles, and identifies clusters within the network as plausibly co-transferred alleles. GeneMates users may choose to score the consistency of allelic physical distances measured in genome assemblies, using a novel approach we have developed, and overlay the scores onto the network for further evidence of HGcoT. Validation studies of GeneMates on known acquired antimicrobial resistance genes in Escherichia coli and Salmonella Typhimurium show advantages of our network approach over simple association analysis: (1) distinguishing between allelic co-occurrence driven by HGcoT and that driven by clonal reproduction, (2) evaluating the effects of population structure on allelic co-occurrence, and (3) directly linking allele clusters in the network to MGEs when physical distances are incorporated.

Conclusion: GeneMates offers an effective approach to the detection of intraspecies HGcoT using WGS data.



Keywords: Software, Prokaryote microbial genomics, Horizontal gene transfer, Acquired genes, Mobile genetic elements, Physical linkage, Population structure, Association analysis, Linear mixed models, Principal components, Network approach, R package

Journal Title

BMC Genomics


BioMed Central
Melbourne International Research Scholarship
Bill and Melinda Gates Foundation
Viertel Foundation of Australia