From 2008.igem.org

Introduction

Why did we come up with two models? We wondered whether this was even the right question. Indeed, should we not rather question the choice of a single model? Here we tell the story of our model and show why it appeared absolutely essential to us to build this dual approach, in which both models interact with each other and foster constructive and purposeful exchanges with the wet lab.

Why is a dual model an absolutely necessary basis to work with?

As in industry, where one is asked to propose several technical solutions while developing a project, we decided to propose two models for the mathematical description. With a single mathematical model, the description and the results obtained are most often biased by the assumptions on which the model rests.

Furthermore, one does not build a model solely to transpose biological information into the abstract formalism of mathematicians. A model has to be designed according to the information one wants to get from it.

Last but not least, the more precise a model is, the more parameters are involved. Straightforward arithmetic makes it clear: adding equations seems to give a better description of reality, but with every added parameter one loses accuracy. What is the optimal balance?

Hence the need to choose approximations adapted to the information we want to get: which effects do we decide to neglect, and what degree of precision do we need? Thus, part of the model lies in understanding the choices that have been made. Here we draw a parallel with the history of physics. A first model was built, often called the classical theory. Then it was discovered that this theory was not sufficient to describe every phenomenon. A new theory, quantum physics, was then developed, in which the “old” effects still found their place. In a way, this is the train of thought we wished to follow. For example, let’s consider the FIFO subsystem. In the BOB approach, the key point developed was the sum effect of FlhDC and FliA on FliL, FlgA and FlhB. This aspect appears as well, alongside the description of chemical effects, in the APE model. However, one chooses to use classical physics or quantum physics depending on what one wishes to prove.

We therefore concentrated on choosing two distinct models that would be complementary, since they fulfill different goals. Each gives different, useful pieces of information about our biological system.

What goals does each model fulfill?

The central issue, as far as biological systems are concerned, is that no established formalism exists yet: the “absolute and irrefutable truth” has not yet been found. For instance, everyone knows how to model gravity on Earth as well as on the Moon. However, no one has ever listed the way fliL behaves depending on its surrounding environment, because it depends on too many elements: which promoter, which concentrations, which pH, which temperature… Today this list seems endless.

Therefore, when building a model, one needs to understand exactly what ground one stands on, what the fundamental hypotheses are, and how they affect the results obtained. To build a mathematical model, one has to have a concrete and precise idea of what one intends to do with it in the end.

BOB: Based On Bibliography Approach

Due to time constraints, we needed to quickly get firm ground on which we could work, so as to understand how our biological system could behave and to give direction to the lab. We therefore needed a model for which we had a good idea of the parameters involved, and which would enable us to understand the dynamics involved as well as the respective influences of the different genes of the cascade.

That is the reason why we looked for models in the literature. This provided us with coherent values for our parameters, from which we were able to build suitable reasoning. The most concrete use of this can be seen in the way we ordered the genes FliL, FlgA and FlhB (link to the other page). We found an interesting model of these interactions in the article by Shiraz Kalir and Uri Alon. Whether it could be adapted to our system, given that the parameters were obtained in conditions that might differ from ours, was of minor importance! What we could use was the fact that FlhDC and FliA influence the three class two genes in different ways. This understanding helped us decide the order of the FIFO genes, since it gave us firmly established arguments for thinking this was the best order.
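The idea that FlhDC and FliA act on each gene with different weights can be illustrated with a minimal sketch. It assumes an additive ("sum") input, where promoter activity is a weighted sum of the two regulators. The function names and all numerical weights below are hypothetical and chosen only for illustration; they are not the published parameters, and the ranking is not a reconstruction of how the FIFO order was actually chosen.

```python
# Hypothetical weights (FlhDC weight, FliA weight) for each class two gene.
# These numbers are illustrative assumptions, NOT values from the literature.
WEIGHTS = {
    "fliL": (1.0, 0.2),
    "flgA": (0.6, 0.5),
    "flhB": (0.3, 0.9),
}


def promoter_activity(gene, flhdc, flia, weights=WEIGHTS):
    """Additive input: activity = w_FlhDC * [FlhDC] + w_FliA * [FliA]."""
    w_flhdc, w_flia = weights[gene]
    return w_flhdc * flhdc + w_flia * flia


def rank_genes(flhdc, flia, weights=WEIGHTS):
    """Return gene names sorted from most to least active for given inputs."""
    return sorted(
        weights,
        key=lambda gene: promoter_activity(gene, flhdc, flia, weights),
        reverse=True,
    )


if __name__ == "__main__":
    # When only FlhDC is present, genes with a large FlhDC weight dominate.
    print(rank_genes(flhdc=1.0, flia=0.0))
    # When only FliA is present, the ranking is reversed with these weights.
    print(rank_genes(flhdc=0.0, flia=1.0))
```

With these assumed weights, the ranking of the three genes flips depending on which regulator dominates, which is the kind of qualitative argument a literature-based model can support even when its numerical values do not transfer exactly.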

Furthermore, this model enabled us to identify which steps of the project had little chance of being achieved and which had a greater chance of success. This was of the utmost importance for the strategy of the project. Since our time was limited, this helped us set our goals and priorities.

Last but not least, it is important to understand our thought process. Rather than trying to describe the biological process that occurs in a gene cascade, we reasoned like engineers: gene A triggers the expression of gene B. Thus, strange as it may seem, finding quantitative parameters in this context enabled us to build qualitative, yet useful, reasoning. We had no real interest in checking whether the oscillation period would be one hour or one hour and a half. Nevertheless, we wanted to have an idea of how we could biologically facilitate the oscillations.

APE: A Parameter Estimation Approach

This approach met other demands. Our APE approach was built to fit biological reality more closely. The goal here was to understand the biological process taking place and to translate it into a usable mathematical formalism.

This enables us to go deeper into the understanding of gene interactions. Every mathematical term can be linked to a biological effect. Even though this reasoning takes more time and more biological data, it enables us to translate every mathematical effect more precisely into a biological input. For example, this model would enable us to tune the strength of a parameter. The next step might be to introduce the effects of transcription, and the ultimate model would add the influence of temperature!

Last but not least, it is essential to understand that this approach requires far more biological data. Therefore, alongside the mathematical model, we designed experiments that could be carried out in order to determine each and every parameter involved in our equations. Unfortunately, these experiments are very costly in terms of time, but we believe them to be essential. This is one of the key interactions between the wet and dry labs that we set up. Here, biologists and mathematicians could use their knowledge simultaneously! In fact, the utopia for a bio-modeler would be to have at one's disposal a library of data for the genes one uses. The Greeks would have believed it impossible to know the interaction strength of the moon by simply opening a book! Let’s learn from sciences that go back to Antiquity! We bet that in a not too distant future, this will be the case with synthetic biology. We wished our approach to go beyond our Bacterio’Clock project!

Which model should be chosen in which case?

It is no mystery that a mathematician's pet hate is determining the parameters to be used. As we saw throughout the previous explanations, when one decides to go deeper in the mathematical translation of reality, one automatically adds new parameters. Assuming, for example, a 10% error when determining one parameter, what is the error made with three times as many parameters? We immediately see that an optimization question underlies this phenomenon.
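To give an order of magnitude for this question, here is a minimal sketch under strong assumptions that are not part of our models: the output is taken to be a simple product of the parameters, and each parameter carries the same relative error. The first function assumes independent errors and uses first-order propagation; the second gives the worst case in which all errors add up in the same direction. Both function names are hypothetical.

```python
import math


def relative_error_independent(n_params, rel_error):
    """First-order relative error of a product of n independent parameters.

    Assumes each parameter has the same relative error and that the errors
    are independent, so they add in quadrature: sqrt(n) * rel_error.
    """
    return math.sqrt(n_params) * rel_error


def relative_error_worst_case(n_params, rel_error):
    """Worst case: every parameter is overestimated by rel_error at once."""
    return (1.0 + rel_error) ** n_params - 1.0


if __name__ == "__main__":
    for n in (1, 3):
        print(
            n,
            round(relative_error_independent(n, 0.10), 4),
            round(relative_error_worst_case(n, 0.10), 4),
        )
```

Under these assumptions, tripling the number of parameters multiplies the typical error by about 1.73 (the square root of 3), and the worst-case error for three parameters at 10% each is about 33%. Real models are not simple products, so these figures only indicate the trend.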

The goal is to optimize the model accuracy. The inputs are:

the “precision” of the model (that is the depth of the phenomena explored), which generally coincides with the number of parameters
the length of the data sets
the error made compared to reality

Akaike (followed by others) devised a criterion that penalizes a model when its error grows, and penalizes it as well when its number of parameters grows. We obtained interesting results when applying it to our system, but the results in themselves are only the tip of the iceberg! We can use such criteria to choose which model is more relevant depending on the data we have at our disposal. The hidden part of the iceberg lies there: we can choose our model depending on what we intend to do!
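As an illustration of how such a criterion trades error against the number of parameters, here is a minimal sketch of the Akaike information criterion in its common least-squares form, AIC = n ln(RSS / n) + 2k, which assumes Gaussian residuals and drops an additive constant. The residual sums of squares and parameter counts below are invented for the example and are not our results.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit, up to an additive constant.

    rss      : residual sum of squares between model and data
    n_points : number of data points
    n_params : number of fitted parameters
    The lower the value, the better the trade-off.
    """
    return n_points * math.log(rss / n_points) + 2 * n_params


def preferred_model(candidates, n_points):
    """Return the name of the candidate with the lowest AIC.

    candidates maps a model name to a (rss, n_params) pair.
    """
    return min(
        candidates,
        key=lambda name: aic_least_squares(
            candidates[name][0], n_points, candidates[name][1]
        ),
    )


if __name__ == "__main__":
    # Hypothetical figures: the detailed model fits better but uses
    # three times as many parameters.
    short_data = {"simple": (2.0, 4), "detailed": (1.5, 12)}
    long_data = {"simple": (20.0, 4), "detailed": (15.0, 12)}
    print(preferred_model(short_data, n_points=20))
    print(preferred_model(long_data, n_points=200))
```

With these invented numbers, the simpler model is preferred on the short data set and the more detailed one on the long data set, which shows how the length of the data sets enters the choice.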

Team:Paris/Modeling/History