Team:Paris/Modeling/estimation

From 2008.igem.org

(Difference between revisions)

Revision as of 13:43, 18 August 2008

Estimation of parameters

To use the promoters involved in the formation of the flagella (see Description of the project), we must clearly define their dynamics. This requires substantial experimental work: determining the so-called 'Hill function' of each promoter.

getting a Hill function from convenient data

We have therefore written a small module that estimates the parameters of a Hill function, even from few and noisy data points. Details and the corresponding code can be found here: findparam.

The method is based on a least-squares optimization. It should therefore be generic enough for many applications, and we would be glad to share the code if you think it could be useful.
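As an illustration of the least-squares idea, here is a minimal sketch in Python. It is not the team's 'findparam' code: the function names, the grid-search strategy and the synthetic data are all assumptions made for this example. It uses the fact that, for fixed K and n, the Hill model is linear in β, so the best β has a closed form.

```python
def hill(tf, beta, k, n):
    """Activating Hill function: beta * tf^n / (k^n + tf^n)."""
    return beta * tf**n / (k**n + tf**n)


def fit_hill(tfs, rates, k_grid, n_grid):
    """Least-squares fit of (beta, K, n) by grid search over K and n.

    For each (K, n) pair the optimal beta is sum(y*h) / sum(h*h),
    where h is the Hill curve evaluated with beta = 1.
    Returns (sum of squared errors, beta, K, n) for the best pair.
    """
    best = None
    for k in k_grid:
        for n in n_grid:
            h = [hill(tf, 1.0, k, n) for tf in tfs]
            denom = sum(v * v for v in h)
            if denom == 0.0:
                continue
            beta = sum(y * v for y, v in zip(rates, h)) / denom
            sse = sum((y - beta * v) ** 2 for y, v in zip(rates, h))
            if best is None or sse < best[0]:
                best = (sse, beta, k, n)
    return best


# Hypothetical, noise-free data generated from beta=50, K=5, n=2.
tfs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
rates = [hill(tf, 50.0, 5.0, 2.0) for tf in tfs]

k_grid = [0.5 * i for i in range(1, 41)]         # 0.5, 1.0, ..., 20.0
n_grid = [0.5 + 0.25 * i for i in range(15)]     # 0.5, 0.75, ..., 4.0
sse, beta, k, n = fit_hill(tfs, rates, k_grid, n_grid)
print(beta, k, n)
```

With noisy measurements the same routine applies unchanged; the grid can then be refined around the best (K, n) pair, or replaced by a gradient-based optimizer.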

getting convenient data

Thus, we need experimental data. To quantify the strength of a transcription factor on a promoter, we will measure GFP fluorescence and compare it to the strength of the constitutive promoter [J23101], as proposed by the iGEM competition. The data we are looking for should take the form of a table of values, giving several 'transcription rates' with their corresponding 'transcription factor concentrations'.

first hypothesis

To this end, we made several hypotheses, which we will verify as well as we can :

(1) We do not take into account the translation phase (see, however, the considerations on the RBS), so we directly correlate the transcription of a gene with the concentration of its protein.

(2) We assume that, whatever gene lies behind the promoter, its expression depends only on the transcription factor of the promoter and not, for instance, on the size of this gene. That is why comparing promoter strengths is relevant only if the genes behind them have similar lengths.

(3) We consider that the activity of a promoter is well described as a Hill function of its transcription factor (TF). Thus, we suppose that the protein concentration (Prot) follows this equation :
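The equation image is not reproduced in this text. A plausible reconstruction, assuming an activating transcription factor and borrowing the symbols β (maximal production rate), K (activation constant) and n (Hill coefficient) that the page uses further down for Ptet, would be:

```latex
\frac{d[\mathrm{Prot}]}{dt}
  = \beta \, \frac{[\mathrm{TF}]^{n}}{K^{n} + [\mathrm{TF}]^{n}}
  - \gamma \, [\mathrm{Prot}]
```

For a repressing transcription factor, the Hill term would instead be K^n / (K^n + [TF]^n).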

where γ is a constant accounting for the degradation and the dilution of the protein over time and cell divisions. Therefore, at steady state and for a given concentration of the transcription factor, we will have :
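The steady-state image is also missing here. Assuming a Hill-type production term with parameters β, K, n and a first-order loss term γ[Prot] (an assumed form, written for an activating transcription factor), setting the time derivative to zero gives:

```latex
0 = \beta \, \frac{[\mathrm{TF}]^{n}}{K^{n} + [\mathrm{TF}]^{n}} - \gamma \, [\mathrm{Prot}]_{eq}
\quad\Longrightarrow\quad
[\mathrm{Prot}]_{eq} = \frac{\beta}{\gamma} \, \frac{[\mathrm{TF}]^{n}}{K^{n} + [\mathrm{TF}]^{n}}
```

The measured steady-state level therefore only determines the ratio β/γ, which is why γ must be known separately.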

(4) Finally, knowing γ will give us the kind of data we are looking for. As a first approach, we assume that, as long as the bacteria are in their exponential growth phase, degradation is far smaller than dilution and can be neglected. We will probably discuss this later.
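Under hypothesis (4), γ reduces to the dilution rate of exponentially growing cells, which is ln(2) divided by the doubling time. The sketch below illustrates this; the function names, the 30-minute doubling time and the fluorescence value are hypothetical and serve only as an example.

```python
import math


def dilution_rate(doubling_time_min):
    """Dilution rate gamma (per minute) for exponential growth: ln(2) / doubling time."""
    return math.log(2) / doubling_time_min


def promoter_activity(prot_eq, gamma):
    """At steady state, production balances loss: activity = gamma * [Prot]_eq."""
    return gamma * prot_eq


gamma = dilution_rate(30.0)                   # hypothetical 30 min doubling time
activity = promoter_activity(1000.0, gamma)   # hypothetical steady-state level (au)
print(gamma, activity)
```

The activity is expressed in the units of the measurement per minute, so it inherits the arbitrary normalisation of the fluorescence signal.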

(5) Unless we find further documents dealing with the relation between fluorescence intensity and GFP concentration, we will directly use the fluorescence measurement, which we will treat as a protein concentration with a more or less arbitrary normalisation.
Actually, we found a linear relation between GFP mut3B concentration (nM) and fluorescence (au). The conversion factor is 79,429. For more detailed information, see [6].

how to control the concentration of the transcription factor ?

By aTc / TetR / Ptet

We now need a reference variable: an element that can be introduced into the bacteria in a well-controlled way, and on which the concentration of our transcription factor depends. We propose a construction in which our transcription factor is placed after the promoter Ptet, which is repressed by TetR. Since aTc is a small diffusible molecule that binds to TetR and thereby relieves the repression of Ptet, we can use it as an 'inducer'. To do so, we must place the gene tetR in the bacterium after a constitutive promoter (such as J23101). According to the previous hypotheses, this provides at steady state a 'constant concentration' of TetR in the bacterium (denoted [TetR]_tot, the TOTAL concentration of TetR, in every form). If we consider the binding reaction this way (where aTc_-_TetR denotes the complex)

with a dissociation constant K_aTc, we find at the steady-state
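The images of the reaction and of its result are not reproduced in this text. As a sketch, assuming a simple 1:1 binding between TetR and aTc and conservation of the total amount of TetR, the derivation would read:

```latex
\mathrm{TetR} + \mathrm{aTc} \;\rightleftharpoons\; \mathrm{aTc\_\text{-}\_TetR},
\qquad
K_{aTc} = \frac{[\mathrm{TetR}]\,[\mathrm{aTc}]}{[\mathrm{aTc\_\text{-}\_TetR}]}

[\mathrm{TetR}]_{tot} = [\mathrm{TetR}] + [\mathrm{aTc\_\text{-}\_TetR}]
\quad\Longrightarrow\quad
[\mathrm{TetR}] = \frac{K_{aTc}\,[\mathrm{TetR}]_{tot}}{K_{aTc} + [\mathrm{aTc}]}
```

If several aTc molecules bind cooperatively, [aTc] would be replaced by [aTc] raised to a Hill-type exponent; that generalisation is an assumption, not something this text confirms.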

where [aTc] denotes the concentration of aTc introduced in the medium, which we assume stays constant over time in all the bacteria, its degradation being close to 0 and its diffusion fast.

According to hypothesis (3), the activity of Ptet would satisfy (keeping the same notations) :

Then, by considering the complexation

File:TetRPtet.jpg

We find at steady-state :

In the last equation, we will have 'access' (see hypothesis (5)) to [prot]_eq and possibly to γ_prot, and we are looking for β_tet, K_tet and n_tet with our program (see getting a Hill function from convenient data). But we need to know K_aTc[TetR]_tot / (K_aTc + [aTc]^{n_aTc}), and we only have [aTc]. However, we could probably obtain [TetR]_tot, which depends on the constitutive promoter (J23101) placed before tetR, and which we will characterise (see Team:Paris/Modeling/estimation#what_are_we_looking_for_.3F).
We then intend to write a new algorithm, based on the same principles as 'findparam' for a classic Hill function, but searching for more parameters.
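To make the chain aTc → free TetR → Ptet expression concrete, here is a minimal Python sketch. It is an assumed model, not the team's code: it takes a repressive Hill function for TetR on Ptet, the free-TetR expression quoted above, and hypothetical function names and parameter values.

```python
def free_tetr(atc, tetr_tot, k_atc, n_atc=1.0):
    """Free TetR: K_aTc * [TetR]_tot / (K_aTc + [aTc]^n_aTc)."""
    return k_atc * tetr_tot / (k_atc + atc**n_atc)


def ptet_steady_state(atc, beta, k_tet, n_tet, gamma, tetr_tot, k_atc, n_atc=1.0):
    """Steady-state protein level behind Ptet for a given aTc concentration.

    Assumes TetR represses Ptet through a Hill function:
    [prot]_eq = (beta / gamma) * K_tet^n / (K_tet^n + [TetR]^n).
    """
    tetr = free_tetr(atc, tetr_tot, k_atc, n_atc)
    return (beta / gamma) * k_tet**n_tet / (k_tet**n_tet + tetr**n_tet)


# Hypothetical parameter values, in arbitrary units.
params = dict(beta=1.0, k_tet=100.0, n_tet=2.0, gamma=1.0, tetr_tot=100.0, k_atc=10.0)
levels = [ptet_steady_state(a, **params) for a in (0.0, 10.0, 1000.0)]
print(levels)
```

Fitting this model to measurements means searching over β_tet, K_tet, n_tet and, if they are not known independently, K_aTc and [TetR]_tot as well, which is why more parameters are involved than for a plain Hill function.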

By AraC / Arabinose / Pbad

Another well-known promoter, Pbad, is induced by the AraC_-_Arabinose complex, where AraC is a protein constitutively produced by an operon attached to Pbad, and arabinose is a sugar that we can add at will to the medium and that diffuses into the cells.

Our "inducer" will then be arabinose, which we will treat as an abstract variable, without trying to determine the concentrations of AraC or of the complex.

The Problem of the RBS

The promoter is not the only factor that controls the expression of the protein; in particular, translation is almost as important as transcription. Since we decided not to take translation into account, we do not want to deal with mRNA and ribosomes; nevertheless, as the "Hill approach" aims to simulate the concentrations involved as precisely as possible, we must integrate the most important influences of translation into the transcription model.

In particular, since the translation rate depends almost linearly on the Ribosome Binding Site (we suppose, since the RBS sets the affinity between the mRNA and the ribosome), it can be absorbed into the β coefficient of the promoter (see finding parameters), even if, strictly speaking, such an induction should then be described for the association "promoter + specific RBS".

Still, as proposed by iGEM, we will use the GFP generator (E0240) together with its RBS (E0032) to characterise the expression of the gene behind each promoter. However, the RBSs before the genes coding for the transcription factors we want to induce are the natural ones (specific to tetR, flhDC, fliA, etc.). Therefore, we must pay attention to what we are actually measuring.

what are we looking for ?

The functions we would like to determine are listed below. Each is linked to a basic description of the 'experimental protocol' that will allow us to obtain the expected data. We decided to leave the original promoters in the bacteria, so that the strength we measure is 'the strength of an additional promoter in the cell', on top of those that already exist: this makes sense for our construction, and probably for most synthetic biology constructions with E. coli.

[expr.(Pbad)] = ƒ0([Arabinose])

[expr.(Ptet)] = ƒ1(TetR,[aTc])

According to hypotheses (1) and (2), we assume this will directly give us [Protein] = ƒ0([Arabinose]) and [Protein] = ƒ1([aTc]), for a given protein encoded by a gene placed behind the Pbad or Ptet promoter.

[expr.(PflhDC)] = ƒ2([fliA])

[expr.(PflhDC)] = ƒ3([OmpR*]) and [expr.(PflhDC)] = ƒ3bis([EnvZ*])

[expr.(PfliA)] = ƒ4([FlhDC],0) and [expr.(PfliA)] = ƒ4(0,[FliA])

[expr.(PfliL)] = ƒ5([FlhDC],0) and [expr.(PfliL)] = ƒ5(0,[FliA])

[expr.(PflgA)] = ƒ6([FlhDC],0) and [expr.(PflgA)] = ƒ6(0,[FliA])

[expr.(PflgB)] = ƒ7([FlhDC],0) and [expr.(PflgB)] = ƒ7(0,[FliA])

[expr.(PflhB)] = ƒ8([FlhDC],0) and [expr.(PflhB)] = ƒ8(0,[FliA])

@@ Line 46: / Line 46: @@
 ====<center>By  aTc / TetR / ''Ptet'' </center>====
-Now, we must use as a variable of reference an element that could be introduced in the bacteria, well-controlled, and from which all the concentrations of our transcription factor  will depend. We propose a construction in which our transcription factor is put after the promoter ''Ptet'', which is under the repression of TetR. Since aTc is a small diffusive molecule that binds to TetR and inhibits this way the repression of ''Ptet'', we can use it as an 'inducer'. To do so, we must place in the bacterium the gene ''tetR'' after a constitutive promoter (like J23101). According to previous hypothesis, this will provide at steady-state a 'constant concentration' of TetR (we note [TetR*], and it is supposed to be the TOTAL concentration of TetR,  under every form) in the bacterium. If we consider the binding reaction this way (where aTc_-_TetR denotes the complex)
+Now, we must use as a variable of reference an element that could be introduced in the bacteria, well-controlled, and from which all the concentrations of our transcription factor  will depend. We propose a construction in which our transcription factor is put after the promoter ''Ptet'', which is under the repression of TetR. Since aTc is a small diffusive molecule that binds to TetR and inhibits this way the repression of ''Ptet'', we can use it as an 'inducer'. To do so, we must place in the bacterium the gene ''tetR'' after a constitutive promoter (like J23101). According to previous hypothesis, this will provide at steady-state a 'constant concentration' of TetR (we note [TetR]<sub>tot</sub>, and it is supposed to be the TOTAL concentration of TetR, under every form) in the bacterium. If we consider the binding reaction this way (where aTc_-_TetR denotes the complex)
-<center> [[Image:aTcTetR.jpg]]</center>
+<center> [[Image:aTcTetRn.jpg]]</center>
-with a dissociation constant K, we find at the steady-state
+with a dissociation constant K<sub>aTc</sub>, we find at the steady-state
-<center> [[Image:quantTetR.jpg]]</center>
+<center> [[Image:TetRfree.jpg]]</center>
-where [aTc] denotes the concentration of aTc we introduced in the medium, that will stay constant in all the bacteria along time, assuming that its degradation is near 0, and that the diffusion is quick. (We can already notice an other limitation in our protocol : this formula has got sense only for [aTc] < (K + [TetR*]), that limits the range of [aTc] we can use to determine the functions below. Notice that this limitation is conditionned by [TetR*],which depends of the constitutive promoter we put before ''tetR''.)
+where [aTc] denotes the concentration of aTc we introduced in the medium, that will stay constant in all the bacteria along time, assuming that its degradation is near 0, and that the diffusion is quick.
 According to the [[Team:Paris/Modeling#first_Hypothesis|hypothesis '''(3)''']], the activity of ''Ptet'' would verify (keeping the same notations) :
-<center> [[Image:hillTetR.jpg]]</center>
+<center> [[Image:exprTetR.jpg]]</center>
-so, if κ denotes the dissociation constant of (TetR + ''Ptet'' &#8644; TetR_-_''Ptet''),
+Then, by considering the complexation
-<center> [[Image:ActPtet.jpg]] </center>
+<center> [[Image:TetRPtet.jpg]]</center>
-In the last equation, we will have 'access' (see [[Team:Paris/Modeling#first_Hypothesis|hypothesis '''(5)''']]
+We find at steady-state :
-) to Prot and possibly to γ, and we are looking for β, κ and n, thanks to our program (see [[Team:Paris/Modeling#getting_a_Hill_function_from_convenient_datas|getting Hill function with convenient datas]]). But we need to know ([TetR*] - [TetR*].[aTc]/(K + [TetR*])), and we just have [aTc]. However, we can reduce the last  equation to
-<center> [[Image:ActPtetReduced.jpg]]</center>
+<center> [[Image:HillActPtet.jpg]] </center>
-Thus, we have two possibilities :
+In the last equation, we will have 'access' (see [[Team:Paris/Modeling#first_Hypothesis|hypothesis '''(5)''']]
-* we can write a new algorithm that optimise an approaching solution of the new parameters, based on the same principal than [[Team:Paris/Modeling/Programs|'findparam']].
+) to [prot]<sub>eq</sub> and possibly to γ<sub>prot</sub>, and we are looking for β<sub>tet</sub>, K<sub>tet</sub> and n<sub>tet</sub>, thanks to our program (see [[Team:Paris/Modeling#getting_a_Hill_function_from_convenient_datas|getting Hill function with convenient datas]]). But we need to know (K<sub>aTc</sub>[TetR<sub>tot</sub>]/(K<sub>aTc</sub> + [aTc]<sup>aTc<sup>)), and we just have [aTc]. However, we could probably get [TetR<sub>tot</sub>], that depends of the constitutive promoter (J23101) we will put before, which we will caracterise (see [[Team:Paris/Modeling/estimation#what_are_we_looking_for_.3F]]).
+Then, we intend to program a new algorithm, based on the same principles of 'findparam' for a classic ''hill function'', but that is looking for more parameters.
-* better but much longer and requiring much more precision, we can use the already noticed properties : [aTc] < (K + [TetR*]). By having a look on the first equation of this section, we understand that beyond this limit, [TetR_-_aTc] will no more evoluate. By observing the evolution of the influence of a growing (by steps as small as possible) concentration of [aTc] introduced, we should be able to approximate the critic concentration when it no more changes, ~(K + [TetR*]). Less is the order n, better is the detection of this critic concentration, because of the greater derivative of the Hill function for small values of TetR. Therefore we should keep this estimation only if we find n ~ 1. Then, by considering κ*((K + [TetR*])/[TetR*]) instead of κ, we should easily determine all the parameters we need, only thanks to [[Team:Paris/Modeling/Programs|'findparam']].
-* Finally, since the repression of ''Ptet'' by TetR in ''E.Coli'' has been well studied, and also the binding of aTc on TetR, we could probably find the dissociation constant K (see above) of the complexation reaction TetR + aTc &#8644; TetR_-_aTc, in order to find directly by calculus the amount of free TetR, function of the expression of the constitutive promoter before ''tetR'' and of the added aTc.
-Then, it is still interesting to compare the three possibilities, if we have enough time...
 ====<center>By  AraC / Arabinose / ''Pbad'' </center>====