Mutation Model
From 2008.igem.org
Latest revision as of 03:28, 30 October 2008
Part I Mutation Model
Hypothesis
- The mutation efficiency from the initial gene to the target gene should be a continuous curve over successive rounds of reproduction, or over time. Here we discretize the mutation rate as a linear function of the replication time (), which in fact creates several points of discontinuity.
- The DNA lesion repair rate appears to be much higher during replication than during other cell activities, so outside the replication process lesion repair can be treated as part of the background mutations.
- The yeast population as a whole is large enough to obey statistical rules.
Model Construction
- Probability Model
For each DNA sequence status, we record a probability matrix for the coding sequence that gives the probability of each base appearing at every site. Each site is described by a 5×1 matrix $(P_a, P_t, P_c, P_g, P_u)$ with $\sum P = 1$, so the status of the whole sequence is recorded as a 5×length matrix. For example, for the Gal4 DNA, with a length of 2646, one status is a 5×2646 matrix.
- Scanning Model
The hotbox appearance rate at a specific site i is calculated as follows.
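- W: a, t; R: a, g; H: a, c, t.
- Hotbox appearance rate: $P_{\mathrm{hotbox}}^{(i)} = P_W^{(i-2)} P_R^{(i-1)} P_C^{(i)} P_H^{(i+1)}$, where the superscript $(i)$ refers to site i and each class probability is the sum over its bases, e.g. $P_W = P_a + P_t$.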
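The probability and scanning models can be illustrated with a short sketch that builds a 5×length status matrix from a known sequence and evaluates the hotbox probability at one site. This is a minimal sketch, not the team's implementation. The function names are hypothetical, sites are 0-based, the four factors are assumed to refer to the consecutive sites i-2, i-1, i, i+1 (with c at site i), and H is assumed to be the IUPAC class a, c, t.

```python
import numpy as np

BASES = "atcgu"  # row order follows (P_a, P_t, P_c, P_g, P_u)

# Base classes used by the scan. Assumption: H is the IUPAC class {a, c, t}.
CLASSES = {"W": "at", "R": "ag", "C": "c", "H": "act"}


def one_hot_status(seq):
    """Return a 5 x len(seq) matrix with probability 1 on each observed base."""
    status = np.zeros((len(BASES), len(seq)))
    for j, base in enumerate(seq.lower()):
        status[BASES.index(base), j] = 1.0
    return status


def class_prob(status, cls, j):
    """Sum the probabilities of the bases in class `cls` at 0-based site j."""
    rows = [BASES.index(b) for b in CLASSES[cls]]
    return status[rows, j].sum()


def hotbox_prob(status, i):
    """Probability that sites i-2..i+1 read W, R, C, H (i is the C site)."""
    if i < 2 or i + 1 >= status.shape[1]:
        return 0.0  # the four-site motif does not fit inside the sequence
    return (class_prob(status, "W", i - 2)
            * class_prob(status, "R", i - 1)
            * class_prob(status, "C", i)
            * class_prob(status, "H", i + 1))
```

For a fully known sequence such as "agca", `hotbox_prob(one_hot_status("agca"), 2)` evaluates to 1.0, because the four sites read W, R, C, H with certainty. Once mutations spread probability over several bases, the same product gives a value between 0 and 1.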
- Model on DNA lesion repair
As reviewed by Odegard and Schatz, a U:G mismatch lesion is repaired through one of three pathways. Lacking knowledge of which repair mechanism yeast prefers, we provisionally assume that the mechanisms occur at random.
VH Odegard, DG Schatz. Targeting of somatic hypermutation. Nat Rev Immunol. 2006 Aug;6(8):573-83.
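One way to read the random-choice assumption is that each of the three pathways is taken with probability 1/3, so the uracil probability at a site is redistributed over an equal-weight mixture of the pathway outcomes. The sketch below shows only that mixing step. The function name and the three outcome vectors are hypothetical placeholders chosen for illustration; they are not values from this page or from the cited review.

```python
import numpy as np

BASES = "atcgu"  # order of (P_a, P_t, P_c, P_g, P_u)

# Hypothetical outcome distributions for the three pathways (placeholders):
# each row says which base replaces u if that pathway is taken.
EXAMPLE_OUTCOMES = [
    [0.00, 1.00, 0.00, 0.00, 0.0],  # assumed: u is read as t
    [0.25, 0.25, 0.25, 0.25, 0.0],  # assumed: any base is inserted
    [0.00, 0.00, 1.00, 0.00, 0.0],  # assumed: the original c is restored
]


def repair_uracil(site, pathway_outcomes):
    """Redistribute P_u at one site over equally likely repair pathways.

    site: length-5 vector (P_a, P_t, P_c, P_g, P_u).
    pathway_outcomes: one length-5 vector per pathway, each summing to 1
    and assumed to put no probability on u.
    """
    site = np.asarray(site, dtype=float)
    outcomes = np.asarray(pathway_outcomes, dtype=float)
    mixture = outcomes.mean(axis=0)  # equal weight for every pathway
    u = BASES.index("u")
    repaired = site.copy()
    p_u = repaired[u]
    repaired[u] = 0.0
    return repaired + p_u * mixture
```

If the preference among pathways in yeast were known, the equal-weight mean could be replaced by a weighted average without changing the rest of the update.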
- Copy status distribution along replication
  - Background mutations happen all the time at a fixed rate.
  - Hotspot mutations happen during transcription of the coding DNA, while lesion repair happens at replication.
  - The figures (Mutation_map1.jpg, Mutation_map2.jpg) show the whole process. $S_i$ is a sequence distribution matrix.
  - After n replications, the proportions of copies in statuses $S_1$ to $S_{n+1}$ follow the binomial expansion of $(1+1)^n$, that is, $C_n^0 : C_n^1 : C_n^2 : \dots : C_n^n$.
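The binomial proportions can be checked numerically. The helper below is a small sketch with a hypothetical name; it normalises the coefficients by 2^n so that the fractions for S_1 to S_{n+1} sum to 1.

```python
from math import comb


def copy_proportions(n):
    """Fractions of copies in statuses S_1..S_{n+1} after n replications."""
    total = 2 ** n  # sum of the binomial coefficients C(n, 0..n)
    return [comb(n, k) / total for k in range(n + 1)]
```

After three replications, for instance, the coefficients are 1:3:3:1, so `copy_proportions(3)` returns `[0.125, 0.375, 0.375, 0.125]`.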
- Result Calculation
We want the probability of reaching the target gene from the initial gene. For the Gal4 DNA, with a length of 2646, the target gene (coding sequence) is $a_1 a_2 a_3 \dots a_{2645} a_{2646}$.
- $P_a^{(j)}$ is the probability of base a at site j, and $\left(P_{a_1}^{(1)} P_{a_2}^{(2)} \cdots P_{a_{2646}}^{(2646)}\right)_i$ means that the calculation is based on the $S_i$ status matrix.
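A possible way to combine these pieces is sketched below: the per-status product of target-base probabilities is computed for each S_i and then averaged with the binomial copy proportions. The weighted sum is an assumption made for illustration, since the original result formula is not reproduced in the text, and the function names are hypothetical.

```python
import numpy as np
from math import comb

BASES = "atcgu"  # row order of each 5 x L status matrix


def target_prob(status, target):
    """Product over sites j of P_{a_j}^{(j)} for one status matrix."""
    rows = [BASES.index(b) for b in target.lower()]
    cols = np.arange(len(target))
    # Note: for thousands of sites this product can underflow to 0.0;
    # summing logarithms would be more robust.
    return float(np.prod(status[rows, cols]))


def overall_target_prob(statuses, target):
    """Average target_prob over S_1..S_{n+1} with weights C(n, i-1) / 2^n."""
    n = len(statuses) - 1
    weights = [comb(n, k) / 2 ** n for k in range(n + 1)]
    return sum(w * target_prob(s, target) for w, s in zip(weights, statuses))
```

Here `statuses` is expected to be the list of matrices S_1 to S_{n+1} in order, each of shape 5×length, and `target` is the target coding sequence as a string over a, t, c, g.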