Mutation Model

From 2008.igem.org


Revision as of 18:51, 29 October 2008

Part I Mutation Model

Hypothesis

  • The mutation efficiency from the initial gene to the target gene should be a continuous curve along reproduction, or along time. Here we discretize the mutation rate as a linear function of the replication time (), which in fact creates several points of discontinuity.
  • The DNA lesion repair rate is much higher during replication than during other cell activities, so outside the replication process lesion repair can be folded into the background mutations.
  • The whole yeast population is large enough to obey statistical rules.

Model Construction

  • Probability Model
    For each DNA sequence status, we record a probability matrix for the coding sequence, which gives the probability of each base appearing at every site. Each site is a 5*1 matrix (Pa, Pt, Pc, Pg, Pu) with ΣP = 1, so the whole sequence status is recorded as a 5*length matrix. E.g., for the Gal4 DNA, with a length of 2646, one status is a 5*2646 matrix.
  • Scanning Model
    The hotbox appearance rate on a specific site i is calculated as below.
Peking hotbox search.jpg



W: a, t; R: a, g; H: a, c, t;
Hotbox appearance rate P_hotbox^(i) = P_W^(i-2) * P_R^(i-1) * P_C^(i) * P_H^(i+1), where the superscript (i) refers to the site i.
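The probability matrix and the hotbox scan can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's implementation: the function names are hypothetical, the status is stored as one 5-entry column per site (length*5 rather than 5*length), the motif is assumed to sit with W at site i-2, R at i-1, C at i and H at i+1, and the base classes are assumed to follow the IUPAC codes (W = a/t, R = a/g, H = a/c/t). Adjust the class table if the model intends different sets.

```python
# Row order of each site column: (Pa, Pt, Pc, Pg, Pu), as in the Probability Model.
BASES = ("a", "t", "c", "g", "u")

# Assumption: IUPAC meanings for the motif classes W-R-C-H.
CLASSES = {"W": "at", "R": "ag", "C": "c", "H": "act"}


def status_from_sequence(seq):
    """Build a status (one 5-entry probability column per site) for a known sequence."""
    return [[1.0 if b == base else 0.0 for b in BASES] for base in seq.lower()]


def class_prob(column, cls):
    """Probability that a site holds any base belonging to the given class."""
    return sum(column[BASES.index(b)] for b in CLASSES[cls])


def hotbox_rate(status, i):
    """Hotbox appearance rate with the C of the motif at 0-based site i."""
    if i < 2 or i + 1 >= len(status):
        return 0.0  # the four-base motif does not fit inside the sequence
    return (class_prob(status[i - 2], "W")
            * class_prob(status[i - 1], "R")
            * class_prob(status[i], "C")
            * class_prob(status[i + 1], "H"))
```

For a fully determined sequence such as "tagca", `hotbox_rate(status_from_sequence("tagca"), 3)` evaluates to 1.0, because sites 1 to 4 read a-g-c-a and match W-R-C-H exactly. Once mutations spread probability over several bases, the same call returns the product of the class probabilities.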
  • Model on DNA lesion repair
    As reviewed in the paper cited below, the U:G mismatch DNA lesion is repaired through three pathways. Lacking knowledge of which repair mechanism yeast prefers, we provisionally assume that the mechanisms occur at random.
Peking Repair.jpg
VH Odegard, DG Schatz. Targeting of somatic hypermutation. Nat Rev Immunol. 2006 Aug;6(8):573-83

  • Copy Status distribution along replication
    • Background mutations happen all the time at a fixed rate.
    • Hotspot mutation happens during transcription of the coding DNA, while lesion repair happens at replication.
    • The figure below shows the whole process. S_i is a sequence distribution matrix.
Mutation map1.jpg
Mutation map2.jpg
After n rounds of replication, the proportions of copies in statuses S_1 to S_(n+1) follow the binomial expansion of (1+1)^n, that is, C_n^0 : C_n^1 : C_n^2 : ... : C_n^n.
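As a small worked example of the binomial spread, the sketch below normalises the coefficients of (1+1)^n by 2^n so that they read as fractions of the whole copy pool. The function name `copy_proportions` is hypothetical and introduced only for illustration.

```python
from math import comb


def copy_proportions(n):
    """Fractions of copies in statuses S_1 .. S_(n+1) after n replications."""
    total = 2 ** n  # sum of the binomial coefficients of (1+1)^n
    return [comb(n, k) / total for k in range(n + 1)]


print(copy_proportions(3))  # [0.125, 0.375, 0.375, 0.125]
```

For n = 3 the ratio 1:3:3:1 becomes 1/8, 3/8, 3/8 and 1/8, and the fractions always sum to 1.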
  • Result Calculation
    The goal is to find the probability of going from the initial gene to the target gene. For the Gal4 DNA (coding sequence), with a length of 2646, the target gene is a_1 a_2 a_3 ... a_2645 a_2646.
Peking ResultFormula.jpg

P_a^(j) is the probability of the base a at site j, and (P_a1^(1) * P_a2^(2) * ... * P_a2646^(2646))_i means that the product is evaluated on the S_i status matrix.
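The per-status product can be sketched as follows. Because the result formula itself is only available as an image, the way the statuses are combined here is an assumption: each S_i is weighted by its binomial copy proportion C_n^(i-1) / 2^n and the weighted products are summed. The function names are hypothetical, and each status is stored as one (Pa, Pt, Pc, Pg, Pu) column per site.

```python
from math import comb, exp, log

BASES = ("a", "t", "c", "g", "u")  # column order (Pa, Pt, Pc, Pg, Pu)


def sequence_probability(status, target):
    """Product over sites j of P_{a_j}^(j), evaluated on one status matrix."""
    if len(status) != len(target):
        raise ValueError("status and target must have the same length")
    log_p = 0.0  # sum logs: a product over 2646 sites would underflow otherwise
    for column, base in zip(status, target.lower()):
        p = column[BASES.index(base)]
        if p == 0.0:
            return 0.0  # the target base cannot occur at this site
        log_p += log(p)
    return exp(log_p)


def target_probability(statuses, target):
    """Assumed combination: binomial-weighted sum over S_1 .. S_(n+1)."""
    n = len(statuses) - 1
    return sum(comb(n, k) / 2 ** n * sequence_probability(s, target)
               for k, s in enumerate(statuses))
```

With two statuses (n = 1), one that is certainly "ac" and one where the first site is a or t with equal probability, `target_probability` for the target "ac" gives 0.5 * 1 + 0.5 * 0.5 = 0.75.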