Team:ETH Zurich/Modeling/Genome-Scale Model

From 2008.igem.org

Revision as of 18:22, 26 October 2008 by Luca.Gerosa (Talk | contribs)


Genome Scale Analysis

In the Restriction Enzymes Analysis modeling section we analyze the effects of restriction enzymes on the genome purely in terms of nucleotide sequences and cutting patterns. This is not informative enough to tell whether the key principles of reduction and selection underlying our minimal genome approach hold in the context of the whole-cell response. Our selection method for strains with smaller genomes rests on the assumption that it is possible to control the growth rate of a strain as a function of its genome size. As explained in the Project Overview, we put a selective pressure on genome size by combining two effects: the random reduction of the genome by restriction enzyme cutting, and the feeding of a limited amount of thymidine to a thymidine auxotrophic strain. In this context, one should also consider the effects that the loss of chromosomal coding regions may have on the physiology of the cell. This scenario needs to be validated using modeling techniques that relate genome content and substrate availability to cell physiology at the system level. Fortunately, in the last ten years huge progress has been made in encoding our understanding of biological networks into comprehensive whole-cell stoichiometric models. This type of model is called a genome-scale model, and we use the most up-to-date genome-scale model for our working strain (E. coli K12 MG1655) to answer the following questions:

  • Is it possible to slow the growth of a strain by using a thymidine auxotrophic strain and limiting the thymidine supply? How are thymidine uptake rates (concentration in the medium) quantitatively related to the growth rate in the auxotrophic background?
  • What is the quantitative effect on growth rate of reducing the genome size of the wild-type strain (under the assumption that no functionality is lost)?
  • What is the combined effect of thymidine limitation and genome reduction on growth rate? Is it possible to identify the best initial and running settings?
  • Which restriction enzymes are best suited to maximize genome reduction and, at the same time, the viability (growth rate) of thymidine auxotrophic strains?
  • What is the predicted difference in genome reduction between cells grown in minimal medium and in very rich medium (in terms of nutrients)?
  • What are the differences if we were to use alternative approaches to our reduction and selection methods, such as a completely random search or a methodical knockout strategy?

These questions are answered below, in the respective sections. First we introduce the concepts of genome-scale models, the theory of Flux Balance Analysis and, in particular, the iAF1260 E. coli genome-scale model developed by [http://gcrg.ucsd.edu/|the Palsson group at UCSD], which we modified and used. In the following sections we show the simulation results for each question.

Genome Scale Models, FBA and E. coli K12 MG1655

Genome-scale models (1) are biological network reconstructions that map genome annotations (ORFs) to the biochemical reactions that define the metabolic network of a particular organism. They are also called stoichiometric models because they encode the quantitative relationships between reactants and products in the balanced biochemical reactions of the organism. When they are informative enough to cover a substantial part of the known functions of the organism they are called in-silico organisms, because they can be used as models to simulate whole-cell responses. Genome-scale models are used in combination with Flux Balance Analysis to predict the physiology (growth rate, uptake rates, yields, etc.) of the in-silico organism under varying external nutrient conditions or changes in genome composition (in-silico knockout experiments). In the last decade this modeling technique has been shown to give results consistent with experimental data under various heterogeneous test conditions (4). Mathematically, Flux Balance Analysis uses the stoichiometric information to predict the metabolic flux distribution under the assumption that the cell maximizes a particular process (for example, biomass production). If N is the stoichiometric matrix and V the vector of fluxes, Flux Balance Analysis assumes steady state:

FBAsteadystate.png

It then computes a flux solution according to a maximization (or the corresponding minimization) objective:

FBAmin.png
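To make the two conditions concrete, here is a minimal sketch of Flux Balance Analysis on a toy network, not on iAF1260. It assumes the standard formulation: maximize c · V subject to N · V = 0, with lower and upper bounds on each flux. The four reactions, their names and their bounds are invented for illustration. Because the problem is tiny, the linear program is solved by enumerating the vertices of the feasible region with exact fractions; a genome-scale model would need a dedicated LP solver instead.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(a, b):
    """Solve the square system a x = b exactly; return None if a is singular."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]


def toy_fba(N, lb, ub, c):
    """Maximize c.v subject to N v = 0 and lb <= v <= ub.

    Assumes N has full row rank and all bounds are finite, so the optimum
    lies at a vertex: (reactions - metabolites) fluxes sit on a bound and
    the steady-state equations fix the remaining ones.
    """
    n_met, n_rxn = len(N), len(N[0])
    best = None
    for pinned in combinations(range(n_rxn), n_rxn - n_met):
        basic = [j for j in range(n_rxn) if j not in pinned]
        for values in product(*[(lb[j], ub[j]) for j in pinned]):
            # Move the pinned fluxes to the right-hand side of N v = 0.
            rhs = [-sum(N[i][j] * val for j, val in zip(pinned, values))
                   for i in range(n_met)]
            solved = solve_square([[N[i][j] for j in basic] for i in range(n_met)], rhs)
            if solved is None:
                continue
            v = [Fraction(0)] * n_rxn
            for j, val in zip(pinned, values):
                v[j] = Fraction(val)
            for j, val in zip(basic, solved):
                v[j] = val
            # Keep only candidates that respect every flux bound.
            if all(lb[j] <= v[j] <= ub[j] for j in range(n_rxn)):
                value = sum(cj * vj for cj, vj in zip(c, v))
                if best is None or value > best[0]:
                    best = (value, v)
    return best


# Hypothetical toy network: uptake -> A, A -> B, B -> biomass, A -> waste.
# Columns: uptake, A->B, biomass, waste
N = [[1, -1, 0, -1],   # metabolite A
     [0, 1, -1, 0]]    # metabolite B
lower = [0, 0, 0, 0]
upper = [10, 1000, 1000, 1000]   # uptake capped at 10 (arbitrary units)
c_biomass = [0, 0, 1, 0]         # objective: maximize the biomass flux

growth, fluxes = toy_fba(N, lower, upper, c_biomass)
print("biomass flux:", growth)
print("flux distribution:", [str(f) for f in fluxes])
```

In this toy case all of the substrate taken up is routed to biomass and the waste flux is zero, because any flux diverted to waste would lower the objective.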

Once the metabolic fluxes are known, it is possible to calculate physiological parameters. For a complete introduction to these concepts please read this [http://www.nature.com/nbt/web_extras/supp_info/nbt0201_125/info_frame.html| Genome Scale Models introduction].
Focusing on the organism of interest, the E. coli in-silico organism we use is iAF1260 (2). It includes the genes involved in central metabolism, amino acid metabolism, nucleotide metabolism, fatty acid and lipid metabolism, carbohydrate assimilation, vitamin and cofactor biosynthesis, energy and redox generation, and macromolecule production (i.e. peptidoglycan, glycogen, RNA, and DNA). To give a flavour of the pathways included in the model, we show the graphical representations of reactions and metabolites that were produced for the previous model version (4).


Central Metabolism

Central.jpg
Amino Acid Metabolism

Amino acid.jpg
Cell Membrane

Cell membrane.jpg
Cofactor Biosynthesis

Cofactor.jpg
Nucleotide Metabolism

Nucleotide.jpg
Alternative Carbon Sources

Alternate carbon.jpg


Overall the iAF1260 model contains:

  • 2382 elementary reactions.
  • 1668 metabolites.
  • 1261 ORFs.


To use the published model, modify its structure (for knockout experiments) and run Flux Balance Analysis, we relied on the COBRA Toolbox (3), a free MATLAB toolbox also published by [http://gcrg.ucsd.edu/|the Palsson group at UCSD]. We are thankful to them for sharing this great work and give them credit for this useful tool.

Different media

As explained in the Project Overview section, the minimal genome, besides possibly not being unique, is also very dependent on the environmental conditions. In particular, bacteria have evolved a complex system of regulation to express genes depending on the availability of one substrate or another. It is therefore intuitive that both the size of the minimal genome and the identity of its genes will vary to a certain (probably significant) extent with the medium. To test this prediction using genome-scale models, we prepared two different in-silico media on which to grow our models. One is a very rich medium that approximates the content of LB medium, except for nucleotides, which are left out so as not to disturb the selection mechanism. The other is a minimal medium with glucose as carbon source. Below we report the composition of our in-silico media and the availability of each component:

  • Minimal medium: Glucose 0.8 mmol/(gDW·h); Oxygen 18.5 mmol/(gDW·h); Thymidine variable; Cob(II)alamin 0.0100 mmol/(gDW·h).
  • Rich medium: Glucose 0.8 mmol/(gDW·h); Oxygen 18.5 mmol/(gDW·h); Thymidine variable; Cob(II)alamin 0.0100 mmol/(gDW·h); not limited: Alanine, Aspartic acid, Glutamic acid, Glutamine, Histidine, Glycine, Leucine, Methionine, Proline, Threonine, Tyrosine, Arginine, Cystine, Isoleucine, Lysine, Phenylalanine, Serine, Tryptophan, Asparagine, Sulfate, Phosphorus (phosphate), Zinc, Cadmium, Mercury, Ammonia.

We then performed growth simulations of the wild-type strain and obtained maximal growth rates of 0.7367 h⁻¹ in minimal medium and 1.1 h⁻¹ in rich medium, in good agreement with experimental values. Having established these two growth conditions, we can test how the cells respond to different nutrient availability.
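One simple way to hold the two in-silico media is a mapping from each component to its maximum uptake rate, with the thymidine rate passed in as the variable parameter. The sketch below is only an illustration of that idea: the dictionary keys and the helper `make_medium` are hypothetical names, the numbers are copied from the list above, and the way these limits are attached to a model (typically as bounds on the exchange reactions) depends on the toolbox used.

```python
UNLIMITED = float("inf")

# Maximum uptake rates in mmol/(gDW*h), copied from the media list above.
MINIMAL_MEDIUM = {"glucose": 0.8, "oxygen": 18.5, "cob(II)alamin": 0.01}

# Components supplied without limit in the rich medium (no nucleotides).
RICH_EXTRAS = [
    "alanine", "aspartic acid", "glutamic acid", "glutamine", "histidine",
    "glycine", "leucine", "methionine", "proline", "threonine", "tyrosine",
    "arginine", "cystine", "isoleucine", "lysine", "phenylalanine", "serine",
    "tryptophan", "asparagine", "sulfate", "phosphate", "zinc", "cadmium",
    "mercury", "ammonia",
]


def make_medium(thymidine_uptake, rich=False):
    """Return a dict of maximum uptake rates for the chosen in-silico medium."""
    medium = dict(MINIMAL_MEDIUM)
    medium["thymidine"] = thymidine_uptake   # the variable component
    if rich:
        medium.update({name: UNLIMITED for name in RICH_EXTRAS})
    return medium


minimal = make_medium(thymidine_uptake=0.05)
rich = make_medium(thymidine_uptake=0.05, rich=True)
print(len(minimal), "components in minimal medium,", len(rich), "in rich medium")
```

Keeping the rich medium as "minimal plus extras" makes it explicit that the two conditions differ only in the unlimited components, so any difference in predicted growth can be attributed to them.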

Thymidine limitation effects on growth rates

The iAF1260 in-silico model mimics the metabolic network of the wild-type E. coli strain K12 MG1655. To simulate the response of a thymidine auxotrophic cell under thymidine limitation, we modified the model by performing an in-silico knockout of the thyA gene and introducing an external thymidine uptake reaction. The picture below shows the modifications:

NucleotideThymSmallSigned.jpg

The reaction termed TDMS is the thymidylate synthase reaction and has been removed from the model. The resulting model is non-viable (growth rate is zero) on media that do not contain nucleotides, in agreement with experimental tests. The reaction TMDt2 has been set up to take up a defined amount of thymidine from the medium. In this way the reaction that needs the nucleotides to produce biomass receives a positive influx, and cell growth is simulated correctly.
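The expected shape of the relation between thymidine uptake and growth can be sketched without the full model. Assume a single biomass reaction that consumes the thymidine-derived nucleotide in a fixed amount per unit of biomass. The steady-state balance then caps the growth rate at the uptake rate divided by that amount, until another nutrient becomes limiting. The per-biomass requirement used below (0.025 mmol/gDW) is a made-up illustrative number, not the iAF1260 coefficient. The ceiling of 0.7367 is the minimal-medium growth rate reported above, taken here to be in units of 1/h.

```python
def toy_growth_rate(thymidine_uptake, mu_max=0.7367,
                    thymidine_per_biomass=0.025, thyA_knockout=True):
    """Growth rate (1/h) of a toy cell under thymidine limitation.

    thymidine_uptake      -- uptake rate in mmol/(gDW*h)
    mu_max                -- growth rate allowed by the rest of the medium
    thymidine_per_biomass -- hypothetical requirement in mmol/gDW
    """
    if thymidine_uptake < 0:
        raise ValueError("uptake rate must be non-negative")
    if not thyA_knockout:
        # Wild type: de novo synthesis covers the demand, so the medium sets the limit.
        return mu_max
    # Auxotroph: growth is capped by the thymidine supply, then by the medium.
    return min(mu_max, thymidine_uptake / thymidine_per_biomass)


for uptake in (0.0, 0.005, 0.01, 0.02):
    print(f"uptake {uptake:.3f} -> growth {toy_growth_rate(uptake):.4f}")
```

In this toy picture the auxotroph does not grow without thymidine, its growth rate rises linearly with uptake, and it saturates at the wild-type rate once thymidine is no longer the limiting nutrient.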

Genome size effects on growth rates

Growth rates as output of whole cell system behaviour

Comparing different approaches to minimal genome

References

(1) "Thirteen Years of Building Constraint-Based In Silico Models of Escherichia coli", J.L. Reed and B.Ø. Palsson, Journal of Bacteriology, 2003, 185(9): 2692-2699.
(2) "A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information", A.M. Feist et al., Molecular Systems Biology, 3:121.
(3) "Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox", S.A. Becker et al., Nature Protocols, 2007.
(4) "In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data", J.S. Edwards, R.U. Ibarra and B.O. Palsson, Nature Biotechnology, 2001, 19: 125-130.