
Phylogenetic approaches for quantifying the genotypic diversity of influenza viruses


Type

Thesis

Authors

Parker, Edyth 

Abstract

Wild aquatic birds are thought to be the main reservoir for influenza A viruses, hosting the largest burden of influenza diversity generated by high frequencies of co-infection and segment reassortment. The recently emerged avian influenza viruses that spilled over into the human population were all produced by dynamic reassortment between wild bird viruses and poultry-adapted H9N2 viruses. The recruitment of poultry-adapted internal gene cassettes (i.e. the polymerase, nucleoprotein, matrix and non-structural genes) has been shown to increase the fitness of wild-bird viruses in domestic poultry populations. It is unclear whether acquisition of poultry-adapted genes by wild-bird viruses increases the probability of human infection through increased fitness in domestic poultry and the associated increase in transmission risk at the human-animal interface, or whether it mediates improved adaptation to mammalian hosts. It is also unclear whether the recent H9N2 genotype that emerged in human infections of H7N9, H10N8 and H5N6 is the only major set of internal genes prevalently facilitating the genesis of novel reassortants of pandemic concern, or whether there are similar internal genotypes circulating. There is therefore a need to characterize the observed and unobserved genotypic diversity of the internal genes of avian influenza viruses across the global influenza ecosystem. However, this effort has been limited by the lack of a pansubtypic nomenclature to partition and describe the complex lineage distribution of the internal genes across all HA/NA-defined subtypes resulting from dynamic reassortment. The ecological and evolutionary processes that structure reassortment dynamics have also been incompletely investigated across subtypes and across reservoir and non-reservoir hosts, with questions remaining regarding constraints on reassortment frequency and co-segregation bias for segments in wild birds relative to domestic birds and swine populations.
The current work set out to address these questions, centrally depending on the development of an internal gene genotyping framework based on phylogenetic clustering of the respective internal gene phylogenies. A new phylogenetic clustering algorithm, PhyCLIP, was developed and described in Chapter 2 to overcome current methods' limiting reliance on arbitrary genetic distance thresholds for cluster definition. PhyCLIP operates on the distribution of all branch lengths in the phylogeny, using this global patristic distance distribution as a pseudo-null distribution against which the within-cluster distance distribution of putative clusters is tested. PhyCLIP was validated on the WHO H5Nx clade nomenclature to identify evolutionarily informative clusters in viral phylogenies. In Chapter 3, PhyCLIP was applied to develop a pansubtypic genotyping nomenclature for the internal genes of avian influenza in a globally representative dataset of 14,428 sequences and 120 subtypes. The system designated 4763 genotypes, with their diversity quantified across spatiotemporal, host and subtype scales. Genotypic diversity was significantly unevenly distributed, with wild birds in North America accounting for 45% of all designated genotypes and subtypes H4N6 and H3N8 for 11% each. Approximately 69% of the genotypes were singletons, reflecting the high reassortment frequency in the natural reservoir. The evolutionary pathways generating genotypes infecting humans were also described using the lineage assignment of the new system, allowing for more complete tracing of progenitor genotypes across subtypes and identification of lineage distinctions between human viruses. Chapter 4 quantified reassortment frequencies in the pansubtypic dataset of avian and swine influenza with a new algorithm, DeviantChild. DeviantChild quantifies phylogenetic incongruency as a measure of reassortment frequency based on PhyCLIP's phylogenetic clustering and patristic distance distributional shift testing.
DeviantChild detected extensive reassortment in the avian influenza dataset and no strong evidence of reassortment bias among segments on a population scale, supporting evidence of predominantly free reassortment of the internal genes. Reassortment was twice and three times as high in the natural reservoir Anseriformes hosts relative to shorebirds and domestic gallinaceous poultry, respectively, with the lowest reassortment frequencies reported for H5Nx, H7Nx and H9Nx viruses in gallinaceous poultry. There was evidence of segregated gene flow for the H13 and H16 subtypes in shorebird populations, supported by evidence from the genotypic diversity distribution of gull-restricted genotypes. Chapter 5 used the genotype distribution and a comprehensive suite of diversity measurements, accounting for sampling heterogeneity, to characterise the patterns of unobserved diversity across HA-NA subtype, geographic region and host order. It identified a subset of low pathogenic viruses, including H4N6, H3N8, H1N1, H6N1 and H6N2, estimated to have very high levels of undetected diversity in wild bird hosts. It also identified wild birds in China, Guatemala and Japan as major sources of undersampled diversity, as well as domestic poultry in Bangladesh and Pakistan and H5N2 in the USA.
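
The general idea of comparing a putative cluster's patristic distances against a tree-wide distribution can be illustrated with a minimal sketch. This is not PhyCLIP's implementation: the toy tree, the function names, the empirical-quantile rule and the cut-off `alpha` are all assumptions introduced here for illustration only.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy tree: child -> (parent, branch length); "root" has no entry.
TREE = {
    "n1": ("root", 0.01),
    "n2": ("root", 0.30),
    "A": ("n1", 0.02),
    "B": ("n1", 0.03),
    "C": ("n2", 0.02),
    "D": ("n2", 0.04),
}
LEAVES = ["A", "B", "C", "D"]


def path_to_root(node):
    """Map each ancestor of `node` (and the node itself) to its distance from `node`."""
    dist, out = 0.0, {node: 0.0}
    while node in TREE:
        parent, length = TREE[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic(a, b):
    """Sum of branch lengths on the path joining two tips."""
    pa, pb = path_to_root(a), path_to_root(b)
    # The most recent common ancestor gives the smallest summed distance.
    return min(pa[n] + pb[n] for n in pa if n in pb)


def pairwise(tips):
    """All pairwise patristic distances among the given tips."""
    return [patristic(a, b) for a, b in combinations(sorted(tips), 2)]


def empirical_quantile(value, global_dists):
    """Fraction of tree-wide distances strictly smaller than `value`."""
    return sum(d < value for d in global_dists) / len(global_dists)


def looks_tight(cluster, global_dists, alpha=0.25):
    """Assumed rule: the cluster's mean distance sits in the lower tail
    of the tree-wide distance distribution."""
    return empirical_quantile(mean(pairwise(cluster)), global_dists) < alpha


GLOBAL = pairwise(LEAVES)  # tree-wide distribution used as the reference
print(looks_tight({"A", "B"}, GLOBAL))  # sister tips
print(looks_tight({"A", "C"}, GLOBAL))  # tips separated by the long branch
```

In this toy tree the sister tips A and B fall in the lower tail of the tree-wide distribution, whereas A and C, which are separated by the long internal branch, do not. The point of the sketch is only that the reference distribution comes from the tree itself rather than from a fixed, externally chosen distance threshold.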

Description

Date

2020-09-29

Advisors

Wood, James
Russell, Colin

Keywords

Influenza, Genomic epidemiology, Evolutionary biology, Phylogenetics

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge