Computer Laboratory, Cambridge University, William Gates Building, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK

Computer Science Department, University of Pisa, Largo Bruno Pontecorvo, 3, Pisa 56127, ITALY

Abstract

Background

Several diseases, many of which are now pandemic, are multifactorial pathologies. Paradigmatic examples come from the immune response to pathogens, where the effects of different infections combine and yield complex mutual feedback, often a positive feedback that boosts infection progression in a scenario that can easily become lethal. HIV is one such infection: it weakens the immune system and favours the onset of opportunistic infections, among them Tuberculosis (TB). Treatment with antiretroviral therapies has proved effective in reducing mortality.

An in-depth understanding of complex systems, such as the one consisting of HIV, TB and related therapies, remains a great open challenge at the boundaries of bioinformatics, computational biology and systems biology.

Results

We present a simplified formalisation of the highly dynamic system consisting of HIV, TB and related therapies, at the cellular level. The progression of the disease (AIDS) hence depends on interactions between viruses, cells and chemokines, on the high mutation rate of viruses, on the immune response of individuals and on the interaction between drugs and infection dynamics.

We first discuss a deterministic model of dual infection (HIV and TB) which is able to capture the long-term dynamics of CD4 T cells, viruses and Tumour Necrosis Factor (TNF). We contrast this model with a stochastic approach which captures intrinsic fluctuations of the biological processes. Furthermore, we also integrate automated reasoning techniques, i.e. probabilistic model checking, in our formal analysis. Beyond numerical simulations, model checking allows general properties (effectiveness of anti-HIV therapies) to be verified against the models by means of an automated procedure. Our work stresses the growing importance and flexibility of model checking techniques in bioinformatics.

In this paper we

Conclusion

We argue that the described methodology suitably supports the study of viral infections in a formal, automated and expressive manner. We envisage a long-term contribution of this kind of approach to clinical Bioinformatics and Translational Medicine.

Background

Human diseases result from abnormalities in an extremely complex system of molecular processes that are often caused by viral or bacterial infections. In these pathological processes virtually no molecular entity acts in isolation, and complexity arises from the vast number of dependencies between molecular and phenotypic features. The key player in human survival is the immune system, a complex system that can be described as a large network of dynamical agents (cells and signalling molecules). A great challenge for contemporary molecular medicine is the modelling, description and ultimately the comprehension of the multistep and multiscale nature of the immune response to pathogens. The challenge is even greater when viral and bacterial infections occur together in the same patient. TB, which is caused by Mycobacterium tuberculosis, is the most frequent co-infection in patients infected by the human immunodeficiency virus type 1 (HIV-1), and it complicates the development of effective therapies.

Efforts to gain further insights into the pathological mechanisms and novel therapeutic targets benefit from the integration of genomic, proteomic, metabolomic and environmental information. In this work we follow an integrated approach that relies on several sources: phylogeny information, traditionally explored within Bioinformatics; differential equations and stochastic modelling, both widely used in Systems Biology; and a formal reasoning technique, viz. Model Checking, developed in Theoretical Computer Science to assess properties of complex computational systems. Furthermore, the Bioinformatics approach in the study of viral dynamics has also focused on identifying variable regions in genomes and pathogenic islands in bacteria

Mathematical and computer science approaches have proved more effective in dissecting the network connectivity of cellular circuits and the corresponding dynamical characteristics. Describing the variation of biomolecular concentrations as a set of ordinary differential equations (ODEs) offers a quantitative basis for predicting the behaviour and evolution of the system and for testing non-linearities. An alternative approach is stochastic simulation via the Gillespie algorithm, which provides an exact algorithmic solution of a set of reactions and a meaningful way to consider the noise

Recently, the observation that biological systems often exhibit interactive and concurrent behavior, similarly to computational concurrent systems, has led to the adoption of formal methods originally developed for the description and analysis of complex software systems in computer science. This abstraction "cell as computation", similar to the "DNA as string" and "protein as labeled graph" abstractions which have originated bioinformatics, has inspired the adoption of model checking methodologies to validate biological complex systems

The aim of this paper is to pipeline bioinformatics and quantitative models of infectious processes and anti-HIV therapies, and then show how model checking techniques can contribute to the interpretation of the

The complexity of HIV infection and anti-HIV therapies

Human immunodeficiency virus type 1 (HIV-1) infection is characterised by the progressive loss of CD4 T cells. Anti-HIV therapies act to eradicate the virus from the body, or at least to lower its concentration, and to replenish the CD4 T cell reservoir. Infection by most strains of HIV requires interaction with CD4 T cells and a chemokine receptor, either CXCR4 or CCR5. Viral strains often use CCR5 during the early stages of HIV-1 infection and then switch to CXCR4 to enter the cells. This switch emerges in more than 50% of patients

On average, it takes 10 years for an HIV infection to progress to AIDS. Some patients die within 2 years of being infected by HIV, while others remain free of AIDS for more than 15 years. The within-patient evolutionary process of viral sequence mutations during HIV infection has suggested improvements in anti-HIV therapies. Anti-HIV drugs are most effective when three or more are taken in combination. This is called combination therapy or HAART (Highly Active Antiretroviral Therapy). Physicians recommend starting therapy if the patient is ill because of HIV, or if the CD4 T cell count is low (below 200 cells per microL). HAART combinations usually include two nucleoside analogues and one protease inhibitor. The nucleoside analogues target the viral reverse transcriptase, which transcribes the viral RNA into DNA that can be integrated into human cells, thereby transforming the cell into a factory for building blocks of the virus. The protease inhibitor prevents an infected cell from producing new infectious virus particles.

Bioinformatic links between HIV and TB

Chemokines provide the key link between HIV and TB infection. Resistance to HIV infection has been found to be related to the following mutations: Delta32 CCR5, 190G CCR2 and 744A CX3CR1 and CCL3L1

**The maximum likelihood phylogeny under the JTT model of evolution for a set of chemokine receptors amino acid sequences**.

The disruption caused by the dual infection (HIV and TB) focuses on RANTES which blocks CCR5 and whose expression is upregulated by TNF. Therefore we incorporated TNF in our deterministic model to predict the potential effect of HAART on coinfection of HIV and TB. That evidence would explain the increase in CXCR4 usage with the increase of TNF concentration

Results and Discussion

Although differential equation models have long been used to model the immune system and viral infections, they focus on the average behaviour of large populations of perfectly mixed, identical individuals. Stochastic simulations arguably provide improved realism, but they are computationally intensive. Given that differential equations and stochastic descriptions have important pros and cons and a good degree of complementarity, a framework that implements both approaches appears advisable, although it is expensive in time and resources. Stochastic models, moreover, support the well-developed probabilistic model checking analysis. We have extended a model first presented in

Furthermore, we have introduced an abstract representation of the HAART treatment by altering the parameters that rule the dynamics of our model according to the known effects of the treatment. We present results of the analysis of the dynamics of HIV, the opportunistic TB infection and an associated therapy, obtained through differential equations, stochastic modelling and formal reasoning techniques.

HIV and the opportunistic TB infection

Experiments carried out by means of the deterministic model are reported in Figure

**Viral load, CD4 T and TNF over the course of time**. (a) Time evolution of viral load and CD4, where mutation of V5 leads to V4 at around day t = 900 and the opportunistic disease (TB) appears at around day t = 2900; (b) dynamics of viral load and TNF over time.

Modelling HAART therapy against HIV and TB

The model of HIV and TB co-infection can easily be extended to embrace the effect of a common anti-HIV treatment such as HAART. Drug therapies such as HAART and Maraviroc, although with different mechanisms, result in decreasing the number of viruses and the death of infected cells. Exploiting the expressivity of the treatment, we model HAART in terms of its effects. For the sake of simplicity this can be done by changing the virus replication rate (

**Experiments with HAART therapy**. Therapy is administered from (a) day t = 250 to day t = 350, (b) day t = 950 to day t = 1050, (c) day t = 2250 to day t = 2350 and (d) day t = 3150 to day t = 3250.
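The idea of modelling a therapy window by altering a rate parameter can be sketched on a much smaller system. The code below is only an illustration under stated assumptions: it uses the standard target-cell-limited model of viral dynamics (uninfected cells `T`, infected cells `I`, free virus `V`) rather than the co-infection model of this paper, all parameter values are made up, and `efficacy` is a hypothetical knob that scales down the virus production rate while therapy is active.

```python
def simulate(days, therapy=None, efficacy=0.9, steps_per_day=100):
    """Forward-Euler integration of a basic viral dynamics model.

    therapy: optional (start_day, end_day) window during which the virus
    production rate is multiplied by (1 - efficacy).
    Returns a list whose k-th entry is the state (T, I, V) at day k.
    All numerical values are illustrative assumptions.
    """
    lam, d = 10.0, 0.01      # supply and death rate of uninfected cells
    beta = 1e-4              # infection rate
    delta = 0.5              # death rate of infected cells
    p, c = 100.0, 5.0        # virus production and clearance rates
    dt = 1.0 / steps_per_day
    T, I, V = 1000.0, 0.0, 1.0
    daily = []
    for n in range(days * steps_per_day):
        t = n * dt
        if n % steps_per_day == 0:
            daily.append((T, I, V))
        # The therapy acts only through the production rate of new virions.
        on = therapy is not None and therapy[0] <= t < therapy[1]
        p_eff = p * (1.0 - efficacy) if on else p
        dT = lam - d * T - beta * T * V
        dI = beta * T * V - delta * I
        dV = p_eff * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return daily


untreated = simulate(400)
treated = simulate(400, therapy=(250, 350))
```

Comparing `treated[340]` with `untreated[340]` shows the qualitative effect discussed in the text: the viral load drops and the uninfected cell count recovers while the window is open. A higher-order integrator would be preferable for quantitative work; Euler is used here only to keep the sketch dependency-free.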

A stochastic description of HAART, HIV and TB dynamics

Many natural and biological phenomena are intrinsically stochastic and discrete, and they cannot always be properly described by means of a deterministic and continuous description. For instance, systemic emergent properties can be sensitive to the local presence of minimal (integer) quantities of molecules

Starting from the deterministic model of HIV, TB and HAART, we have determined a corresponding stochastic model. This has been done via standard transformation from deterministic rates into stochastic ones, according to a fixed reference volume of the model (see, e.g.,

Simulation results are reported in Figure

**A limited fragment of the time course of Viruses, CD4 T cells and TB bacteria in the case of (a) no treatment and (b) treatment with HAART**.

Assessing HAART therapy against HIV and TB infection

Observing the quantitative results of the stochastic simulations in Figure

Therapy effects

We focus on two properties that can give a measure of the infection progression.

where

Experimental results

The analysis has been performed using the PRISM model checker (see Methods section). Properties can be validated either by constructing the complete Continuous-time Markov Chain for the given model or by approximate verification, which samples a certain number of possible evolutions. The former can be costly or infeasible for large models. Adopting the latter we obtained the results reported in Table

Quantitative results from the automated verification of the effects of HAART therapy. While the number of viral replications due to CD4 T cell infection is comparable with and without HAART treatment for the time interval considered, a much stronger probability of a failure of the immune system,

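| Property | **HIV+TB** | **HIV+TB+HAART** |
|---|---|---|
| Viral replications due to CD4 T cell infection | 250 | 269 |
| Probability of a failure of the immune system | 0.885 | 0.429 |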

Automated verification yields quantitative measures of the investigated properties. While the number of viral replications due to CD4 T cell infection is comparable (but without HAART many other infected cells contribute to virus replication), the probability of a failure of the immune system is much stronger without HAART treatment,

Conclusion

We have illustrated the potential benefits of formal methods and quantitative models when applied to the study of viral infections and therapy assessment within a computational bioinformatics approach. We have done this by presenting experiments on a proof-of-concept scenario regarding HIV infection and the related opportunistic TB infection. We have adopted an integrated approach, combining deterministic and stochastic techniques and illustrating how properties of interest for the study of viral infections can be formalised in a general-purpose logic, such as the one supported by PRISM. Our work stresses the growing importance and flexibility of model checking techniques in bioinformatics. Notably, the verification of these properties can precisely characterise the numerical results of simulations, and this can be helpful in comparing and assessing different antiviral therapies. In conclusion, the modelling of HIV infection has two important linked benefits.

Methods

In this section we provide further details about the construction of the models and the analysis methodologies used.

A deterministic model of HIV strains and TB time evolution

Our work is based on a deterministic model of HIV-1 dynamics, which first appeared in

As standard, we describe the variations of the quantities of the modelled entities as a set of differential equations. We start considering a pool of immature CD4 T cells, represented by the variable _{U}, and evolve into differentiated, uninfected T-cells, ^{U}. Also, the TNF (_{Z }due to CTLs (_{i}), which we do not detail here (see _{k}) by interacting with the virus strains _{k }at rate _{k}, or die at rate ^{T}, (equation (5)). Note that the infection parameter ^{I}, and also due to the action of CTLs with rate

The stochastic model

The stochastic model has been derived from the deterministic one. As a simplification we consider only two viral strains: R5 viral strain (_{k54}). These two strains are the ones we focus on when observing the time course of infections. As for the deterministic model, we have included sufficient details of interaction among the species to express the properties of interest.
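A stochastic model of this kind is usually simulated with Gillespie's direct method. The following is a minimal sketch, not the implementation used for the experiments in this paper: the species names `V5` and `V4` echo the two strains, but the three or four reactions and all rate values are hypothetical placeholders chosen only to make the example run.

```python
import random


def gillespie(init, reactions, t_end, rng):
    """Gillespie's direct method for a set of reactions.

    init:      dict mapping species name to an integer count
    reactions: list of (propensity_function, change_dict) pairs
    Returns the trace as a list of (time, state) pairs.
    """
    state = dict(init)
    t = 0.0
    trace = [(t, dict(state))]
    while True:
        props = [rate(state) for rate, _ in reactions]
        total = sum(props)
        if total <= 0.0:          # no reaction can fire any more
            break
        t += rng.expovariate(total)   # exponential waiting time
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        r = rng.random() * total
        chosen = len(reactions) - 1   # guard against rounding at the boundary
        acc = 0.0
        for i, a in enumerate(props):
            acc += a
            if r < acc:
                chosen = i
                break
        for species, delta in reactions[chosen][1].items():
            state[species] += delta
        trace.append((t, dict(state)))
    return trace


# Hypothetical two-strain example (all rates are illustrative assumptions).
REACTIONS = [
    (lambda s: 0.5 * s["V5"], {"V5": +1}),             # V5 replication
    (lambda s: 0.4 * s["V5"], {"V5": -1}),             # V5 clearance
    (lambda s: 0.01 * s["V5"], {"V5": -1, "V4": +1}),  # V5 mutates into V4
    (lambda s: 0.4 * s["V4"], {"V4": -1}),             # V4 clearance
]
trace = gillespie({"V5": 50, "V4": 0}, REACTIONS, t_end=10.0, rng=random.Random(1))
```

Each run with a different seed yields a different trajectory, which is what makes the intrinsic fluctuations visible; averaging many runs recovers behaviour close to the deterministic description when counts are large.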

As mentioned, we needed to translate some parameters from deterministic to stochastic ones, according to the reference volume of interest. Other parameters, particularly where data are lacking in the literature, have been approximated by tuning the model on known macroscopic behaviour. We considered a reaction volume large enough to contain a statistically reliable number of agents (viruses and cells), and the values of extensive quantities are scaled according to the reaction volume considered.
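The translation from deterministic to stochastic rates can be illustrated with the textbook conversion rules. The sketch below assumes that concentrations are expressed as numbers of agents per unit volume (so no Avogadro factor is needed) and covers only reactions of order 0, 1 and 2 with distinct reactants; the function names are hypothetical and the numbers in the usage lines are arbitrary.

```python
def stochastic_rate(k, order, volume):
    """Convert a deterministic rate constant into a stochastic one.

    Assumes concentrations measured in agents per unit volume.
    order 0: constant inflow, scales up with the volume
    order 1: unimolecular, independent of the volume
    order 2: bimolecular with two distinct reactants, scales as 1/volume
    """
    if order == 0:
        return k * volume
    if order == 1:
        return k
    if order == 2:
        return k / volume
    raise ValueError("only reaction orders 0, 1 and 2 are handled")


def scale_count(concentration, volume):
    """Initial number of agents for a given concentration (extensive quantity)."""
    return round(concentration * volume)


# Arbitrary illustrative values for a reference volume of 10 units.
inflow = stochastic_rate(100.0, 0, 10.0)
infection = stochastic_rate(0.0025, 2, 10.0)
initial_cells = scale_count(50.0, 10.0)
```

Reactions in which two agents of the same species meet need an additional combinatorial factor, which is deliberately left out of this sketch because its value depends on how the deterministic rate law is written.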

We employ a _{4 }= 50 -_{5}, _{5 }= 50,

Interactions in the stochastic model. The values of the parameters are from literature referred (for sake of clarity we skip the dimensionality which is the standard reported in literature), see also

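| Interaction | Parameter value |
|---|---|
| (01) | _{U} = 100 |
| (02) | ^{U} = 0.1 |
| (03) | |
| (04) | _{k5} = 0.0025 |
| (05) | _{k4} = 0.0025 |
| (06) | _{k54} = 0.025 |
| (07) | _{T} = 0.1 |
| (08) | _{I} = 0.8 |
| (09) | _{I} = 0.8 |
| (10) | _{Z} = 0.01 |
| (11) | |
| (12) | |
| (13) | |
| (14) | _{5} = 2.5 |
| (15) | |
| (16) | _{F} = 0.1 |
| (17) | |
| (18) | _{B} = 10 |
| (19) | |
| (20) | ^{F} = 0.001 |
| (21) | |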

where

As far as (the effects of) HAART therapy is concerned, it has been modelled by means of a stochastic triggering event that activates and deactivates the treatment. Activation consists in the modification of interactions (12) and (13), whose rate is downgraded to 1, representing reduced morbidity of the virus, which is one of the main effects of HAART.

Probabilistic Model Checking

Since an introduction to probabilistic model checking is beyond the scope of this paper, we refer the interested reader to the cited literature and references therein. Other recent works are

The information encoded in a stochastic model describes, roughly speaking, the states in which the modelled system can find itself and the associated probability of being in a state at a certain time. For many natural phenomena that follow a memoryless probability distribution this amounts to a (Continuous Time) Markov Chain (CTMC). Furthermore, given a stochastic model, several algorithms to simulate the possible quantitative evolutions of the system have been defined, notably Gillespie's algorithm

can be used to represent the set of states in which an "

which, informally speaking, represents all the traces for which

PRISM

The PRISM probabilistic model checker _{k4}. A PRISM model can be used for simulation purposes, and for exact or approximate model checking as illustrated.
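The approximate, sampling-based style of verification can be sketched independently of PRISM. The code below is an assumption-laden illustration, not PRISM's algorithm or its input language: it estimates the probability that a time-bounded reachability property ("the goal condition holds at some time up to `t_max`") is satisfied by sampling paths of a small CTMC, and reports a normal-approximation 95% confidence half-width. The CD4-like example model, its rates and the threshold are hypothetical.

```python
import math
import random


def path_satisfies(init, reactions, goal, t_max, rng):
    """Sample one CTMC path; True if goal(state) holds at some time <= t_max."""
    state = dict(init)
    t = 0.0
    while True:
        if goal(state):
            return True
        props = [rate(state) for rate, _ in reactions]
        total = sum(props)
        if total <= 0.0:
            return False
        t += rng.expovariate(total)
        if t > t_max:
            return False
        r = rng.random() * total
        chosen = len(reactions) - 1   # guard against rounding at the boundary
        acc = 0.0
        for i, a in enumerate(props):
            acc += a
            if r < acc:
                chosen = i
                break
        for species, delta in reactions[chosen][1].items():
            state[species] += delta


def estimate_probability(init, reactions, goal, t_max, samples=2000, seed=0):
    """Monte Carlo estimate of the probability, with a 95% half-width."""
    rng = random.Random(seed)
    hits = sum(path_satisfies(init, reactions, goal, t_max, rng)
               for _ in range(samples))
    p = hits / samples
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / samples)
    return p, half_width


# Hypothetical example: a pool of T cells with constant supply and
# per-cell death; "failure" means the pool drops below 10 cells.
CD4_REACTIONS = [
    (lambda s: 1.5, {"T": +1}),             # supply
    (lambda s: 0.1 * s["T"], {"T": -1}),    # death
]
p_fail, err = estimate_probability({"T": 20}, CD4_REACTIONS,
                                   lambda s: s["T"] < 10, t_max=50.0)
```

The number of samples controls the trade-off between cost and precision: the half-width shrinks with the square root of the sample count, which is why sampling remains feasible for models whose full state space cannot be built.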

**The implementation of the stochastic model (Table 2) in the PRISM modelling language**.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AS and PL studied the deterministic model of infection used in this paper. PL investigated the phylogenetic information. AS and AB defined the stochastic version of the model and ran the in-silico experiments. All authors worked on the experimental results.

Acknowledgements

The authors wish to thank the anonymous referees for their helpful comments on a previous version of this paper. This project is partially supported by EC IST SOCIALNETS and WADA projects.

This article has been published as part of