Team:Calgary Software/Project

From 2008.igem.org

Revision as of 19:10, 10 June 2008 by Boris (Talk | contribs)



The Project

EvoGEM is an agent-based system that designs iGEM "genetic circuits" through evolutionary design. In this methodology, evolution is used to select for efficient designs. The survival criteria are determined by the purpose for which the circuits are built: circuits that fulfill that purpose well survive and produce even better offspring, while those that fail to meet the criteria are removed from the gene pool. Agent-based logic, which requires minimal prior assumptions about the overall behavior of the system, combined with the empirically proven evolutionary design approach, yields a system that can both emulate and develop iGEM circuits.
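To make the select-and-reproduce idea concrete, here is a toy sketch in Python. It is not EvoGEM's code and it is not agent-based: the part alphabet, the target layout, the fitness function and all parameter values are assumptions made up for illustration only.

```python
import random

# Hypothetical alphabet of part types and a hypothetical "purpose" (target layout).
PART_TYPES = ["promoter", "rbs", "coding", "terminator"]
TARGET = ["promoter", "rbs", "coding", "terminator"]


def fitness(circuit):
    """Toy survival criterion: number of positions matching the target layout."""
    return sum(1 for got, want in zip(circuit, TARGET) if got == want)


def mutate(circuit, rng):
    """Return a copy of the circuit with one randomly chosen position replaced."""
    child = list(circuit)
    child[rng.randrange(len(child))] = rng.choice(PART_TYPES)
    return child


def evolve(generations=30, pop_size=20, seed=0):
    """Keep the fitter half each generation and refill with mutated offspring."""
    rng = random.Random(seed)
    population = [[rng.choice(PART_TYPES) for _ in TARGET] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # the rest leave the gene pool
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(best, history[-1])
```

Because the best circuit always survives into the next generation, the best fitness never decreases from one generation to the next in this sketch.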

EvoGEM was briefly presented during the 2007 iGEM Jamboree and sparked quite a lot of interest amongst the different teams. This summer, our team plans to further develop the fitness function EvoGEM employs, introduce more complex pattern recognition, and test the system under a much larger search space than before. The final goal is to produce a system sophisticated enough to rebuild working designs from the projects of previous years' teams, and intelligent enough to simulate the successes and failures of working and non-working systems, respectively.

The main focus of this project is to build Perl scripts that will support EvoGEM's requirement for a flat file registry, and to develop the EvoGEM code to include the behaviors described above.

Project Details

The focus of the first part of the project is to create a Perl script to query the registry and other databases in order to retrieve critical information about different BioBricks. After that, the goal will be to improve the EvoGEM code to introduce the changes discussed in the previous section.

Step I

First off, we will characterize the parts that Kent has pulled from the registry as ideas to simulate.

The systems Kent suggested are linked here:

[http://partsregistry.org/wiki/index.php?title=Part:BBa_J45120 BBa_J45120]

[http://partsregistry.org/wiki/index.php?title=Part:BBa_J45200 BBa_J45200]

[http://partsregistry.org/wiki/index.php?title=Part:BBa_J45250 BBa_J45250]

[http://partsregistry.org/wiki/index.php?title=Part:BBa_I0462 BBa_I0462]

[http://partsregistry.org/wiki/index.php?title=Part:BBa_J45996 BBa_J45996]

Once these are simulated well, we can begin adding more systems to enlarge the search space and try to find more complex pathways.

When characterizing proteins, include a field in the parts file titled "sequence = " followed by the protein sequence in ONE-LETTER AMINO ACID notation.

Translate DNA code into amino acids with [http://www.vivo.colostate.edu/molkit/translate/index.html this program].

e.g.

This DNA sequence:

atggaagttgttgaagttcttcacatgaatggaggaaatggagacagtagctatgcaaacaattctttggttcagcaaaaggtgattctcatgacaaagc caataactgagcaagccatgattgatctctacagcagcctctttccagaaaccttatgcattgcagatttgggttgttctttgggagctaacactttctt ggtggtctcacagcttgttaaaatagtagaaaaagaacgaaaaaagcatggttttaagtctccagagttttattttcacttcaatgatcttcctggcaat gattttaatacactttttcagtcactgggggcatttcaagaagatttgagaaagcatataggggaaagctttggtccatgttttttcagtggagtgcctg gttcattttatactagacttttcccttccaaaagtttacattttgtttactcctcctacagtctcatgtggctatctcaggtgcctaatgggattgaaaa taacaagggaaacatttacatggcaagaacaagccctctaagtgttattaaagcatactacaagcaatatgaaatagatttttcaaattttctcaagtac cgttcagaggaattgatgaaaggtggaaagatggtgttaacactcctaggtagagaaagtgaggatcctactagcaaagaatgctgttacatttgggagc ttctagccatggccctcaataagttggttgaagagggattgataaaagaagagaaagtagatgcattcaatattcctcaatacacaccatcaccagcaga agtaaagtacatagttgagaaggaaggatcattcaccattaatcgcttggaaacatcaagagttcattggaatgcttctaataatgagaagaatggtggt tacaatgtgtcaaggtgcatgagagctgtggctgagcctttgcttgtcagccactttgacaaggaattgatggatttagtgttccacaagtacgaagaga ttgtttctgattgcatgtccaaagagaatactgagtttataaatgtcatcatctccttgaccaaaataaattaa

Translates into this amino acid sequence:

MEVVEVLHMNGGNGDSSYANNSLVQQKVILMTKPITEQAMIDLYSSLFPETLCIADLGCS LGANTFLVVSQLVKIVEKERKKHGFKSPEFYFHFNDLPGNDFNTLFQSLGAFQEDLRKHI GESFGPCFFSGVPGSFYTRLFPSKSLHFVYSSYSLMWLSQVPNGIENNKGNIYMARTSPL SVIKAYYKQYEIDFSNFLKYRSEELMKGGKMVLTLLGRESEDPTSKECCYIWELLAMALN KLVEEGLIKEEKVDAFNIPQYTPSPAEVKYIVEKEGSFTINRLETSRVHWNASNNEKNGG YNVSRCMRAVAEPLLVSHFDKELMDLVFHKYEEIVSDCMSKENTEFINVIISLTKIN-
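The same translation can also be done locally. The following is a minimal sketch in Python (the team planned to use Perl, so treat this only as an illustration of the codon lookup); the function name and the choice of "-" as the stop symbol, which mirrors the output above, are assumptions. It uses the standard genetic code and reads the sequence in frame from the first base, without stopping at stop codons.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, stop_symbol="-"):
    """Translate a DNA string into one-letter amino acid notation.

    Whitespace is ignored and case does not matter. Codons that are not in
    the table (for example, ones containing N) are reported as "X". Trailing
    bases that do not fill a whole codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First 30 bases of the example sequence above.
print(translate("atggaagttgttgaagttcttcacatgaat"))  # MEVVEVLHMN
```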

Step II

We will be developing a web-browsing script that uses [http://www.comp.leeds.ac.uk/Perl/start.html Perl] to retrieve relevant information from [http://partsregistry.org/Main_Page the parts registry] as well as from other databases such as [http://www.pir.uniprot.org/ UniProt], [http://pubchem.ncbi.nlm.nih.gov/ PubChem] and [http://www.chemspider.com/ ChemSpider]. Take the time to at least glance at these databases and at Perl.

Here is a nice tutorial on [http://www.perl.com/pub/a/2002/08/20/perlandlwp.html web-browsing using Perl].

The final goal is to use Perl to build a large-scale local registry of the same type we made in Step I.

The workflow of the software is as follows:

Retrieve information about every part in the registry
Use that information to characterize promoters, RBS and terminators
Retrieve information about produced proteins from different databases and use that to characterize reporters and protein coding regions
Retrieve information about reactions from different databases and use that to characterize specific enzymes produced by protein coding regions
Retrieve information about reactants and products from different databases to characterize specific compounds related to proteins (substrates, products, inducers, inhibitors, etc.)
Use all that information to build a flat file registry for EvoGEM to use
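The first step of this workflow, retrieving a part's page, could look roughly like the sketch below. It is written in Python with the standard library rather than Perl and LWP, and the function names are made up for illustration. The URL pattern is copied from the part links in Step I; whether the registry still serves pages at that address is an assumption, so the fetch function is defined here but not called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the part links in Step I (it may no longer resolve).
REGISTRY_INDEX = "http://partsregistry.org/wiki/index.php"


def part_url(part_name):
    """Build the wiki URL for a part name such as 'BBa_J45120'."""
    query = urlencode({"title": "Part:" + part_name}, safe=":")
    return REGISTRY_INDEX + "?" + query


def fetch_part_page(part_name, timeout=10):
    """Download the raw HTML of a part page (requires network access)."""
    with urlopen(part_url(part_name), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


print(part_url("BBa_J45120"))
```

The later steps (characterizing parts, proteins and compounds) would then work on the text returned by fetch_part_page.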

The first thing to do tomorrow is to make sure everyone has Perl and BioPerl on all machines.

The division of roles is as follows:

Boris - Build a module that retrieves all the parts from the registry and characterizes promoters, RBS and terminators. The results will be stored in 4 arrays: @final_parts_names, @final_parts_types, @final_parts_parts (if the part is composite, this is a breakdown of its subparts; if it is not, it contains the string "NOTHING") and @final_parts_sequences.
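As an illustration of how the four parallel arrays line up, here is a small Python sketch (the real module is meant to be Perl). The helper names, the part names and the sequences are placeholders invented for the example, not registry data.

```python
NOTHING = "NOTHING"  # sentinel string described above

# Four parallel lists: index i in each list describes the same part.
final_parts_names = []
final_parts_types = []
final_parts_parts = []
final_parts_sequences = []


def add_part(name, part_type, sequence, subparts=None):
    """Append one part to all four lists so their indices stay aligned."""
    final_parts_names.append(name)
    final_parts_types.append(part_type)
    final_parts_parts.append(list(subparts) if subparts else NOTHING)
    final_parts_sequences.append(sequence)


def lookup(name):
    """Collect the fields of one part from the four lists."""
    i = final_parts_names.index(name)
    return {
        "type": final_parts_types[i],
        "parts": final_parts_parts[i],
        "sequence": final_parts_sequences[i],
    }


# Placeholder entries (hypothetical names and sequences).
add_part("EXAMPLE_PROMOTER", "promoter", "acgtac")
add_part("EXAMPLE_RBS", "rbs", "gtacgt")
add_part("EXAMPLE_DEVICE", "composite", "acgtacgtacgt",
         subparts=["EXAMPLE_PROMOTER", "EXAMPLE_RBS"])

print(lookup("EXAMPLE_DEVICE")["parts"])
```

Always adding parts through a single function is one way to keep the four arrays the same length.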

Taras - Build a module that accepts a DNA sequence, converts it into an amino acid sequence (a hash table that matches codons to amino acids should be useful), searches a protein database using that sequence, and retrieves the protein name and, if the protein is an enzyme, its reaction. You should produce 4 arrays: @final_part_products (stores what protein each part produces; if a part, such as an RBS, produces nothing, the array element will be the string "NOTHING"), @final_protein_names (a list of all protein names), @final_protein_inputs (a list of inputs for all proteins that are enzymes; if they are not enzymes, their input is "NOTHING") and @final_protein_outputs.

Neven - Build a module that accepts a chemical substance's name and uses a chemical database to convert the name into an InChI string. It then retrieves the InChI, SMILES and mass (in kDa) for that molecule from the database. Any field that cannot be found should be "NOTHING".
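The database lookup itself depends on which chemical database is used, so it is left out here. The sketch below (Python, hypothetical function and key names) only shows the two rules stated above: missing fields become "NOTHING", and the mass is reported in kDa. It assumes the database reports mass in daltons under a key called "mass_da".

```python
NOTHING = "NOTHING"


def daltons_to_kilodaltons(mass_da):
    """Convert a mass in daltons (Da) to kilodaltons (kDa)."""
    return mass_da / 1000.0


def normalise_compound(raw):
    """Turn a raw lookup result into the three fields the module returns.

    `raw` is a dict that may contain "InChI", "SMILES" and "mass_da";
    any of them may be missing or empty.
    """
    record = {}
    for field in ("InChI", "SMILES"):
        value = raw.get(field)
        record[field] = value if value else NOTHING
    mass_da = raw.get("mass_da")
    record["weight"] = (
        daltons_to_kilodaltons(mass_da) if mass_da is not None else NOTHING
    )
    return record


# Water, with the InChI deliberately left out to show the fallback.
print(normalise_compound({"SMILES": "O", "mass_da": 18.015}))
```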

Josh - Build a module that accepts all that information and constructs 2 files ("parts", "proteins_and_molecules") with the following fields (just an example):

In parts

[part name]

type = protein coding region

sequence = atggtgcatctatagacgtacgtcatgcagtactactgattattttagctgcacgtcagtacgctaac...taa

output = Heme oxygenase 1

In proteins_and_molecules

[Heme oxygenase 1]

sequence = MVGTRDSSFFT...*

input = InChI=1/C34H34N4O4.Fe/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16- + 3AH(2) + 3O(2) (notice InChI)

output = biliverdin + Fe(2+) + CO + 3 A + 3 H(2)O (notice no InChI)

weight = ... (This should be only a number, and should be in kiloDaltons (kDa))

[Heme]

InChI = InChI=1/C34H34N4O4.Fe/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-

SMILES = [Fe+2].O=C(O)CCc1c(c3[n-]c1cc/5nc(cc2[n-]c(c(c2C)\C=C)cc/4nc(c3)C(\C=C)=C\4C)\C(=C\5CCC(=O)O)C)C

synonyms = ...

weight = ... (This should be only a number, and should be in kiloDaltons (kDa))

Fields whose values could not be obtained should not appear.
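A minimal Python sketch of writing and reading this file layout is shown below (the actual module is meant to be Perl). The function names are hypothetical, and the InChI value in the usage lines is a shortened placeholder, not a real identifier. Fields set to "NOTHING" are skipped when writing, as required above.

```python
NOTHING = "NOTHING"


def format_entry(name, fields):
    """Render one entry as a '[name]' header followed by 'key = value' lines.

    Fields whose value is None or the string "NOTHING" are left out.
    """
    lines = ["[%s]" % name]
    for key, value in fields.items():
        if value is None or value == NOTHING:
            continue
        lines.append("%s = %s" % (key, value))
    return "\n".join(lines) + "\n"


def parse_entries(text):
    """Parse text in the same layout back into {entry name: {field: value}}."""
    entries = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = entries.setdefault(line[1:-1], {})
        elif current is not None and " = " in line:
            # Split on the first " = " only, so values such as "InChI=1/..."
            # keep their own "=" characters.
            key, value = line.split(" = ", 1)
            current[key.strip()] = value.strip()
    return entries


text = format_entry("Heme oxygenase 1",
                    {"sequence": "MVGTRDSSFFT", "input": NOTHING,
                     "output": "biliverdin"})
text += format_entry("Heme", {"InChI": "InChI=1/PLACEHOLDER"})
print(text)
print(parse_entries(text))
```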

Kent - Maintain the UofC iGEM software team page

Meanwhile, Terrance will be responsible for obtaining MATLAB for our team, for the purpose of designing a network system wrap-around for the simulation environment. Kent will be responsible for developing a bridge between MATLAB and EvoGEM.

Since regular expressions (RE) will be a big part of this kind of search, here is a [http://www.comp.leeds.ac.uk/Perl/matching.html link to RE in Perl] and here is [http://www.troubleshooters.com/codecorn/littperl/perlreg.htm another one].
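As a small taste of the kind of matching involved, the Python sketch below pulls part names out of a piece of text. The pattern ("BBa_" followed by letters and digits) is an assumption based on the part names listed in Step I and may not cover every name in the registry. The same pattern syntax works in Perl.

```python
import re

# Assumed shape of a part name, based on the examples in Step I.
PART_NAME = re.compile(r"\bBBa_[A-Za-z0-9]+\b")


def find_part_names(text):
    """Return the distinct part names found in text, in order of appearance."""
    seen = []
    for name in PART_NAME.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


sample = "index.php?title=Part:BBa_J45120 BBa_J45120 and also BBa_I0462"
print(find_part_names(sample))  # ['BBa_J45120', 'BBa_I0462']
```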

Here is another useful link about [http://www.troubleshooters.com/codecorn/littperl/perlsub.htm subroutines in Perl].

In case LWP is not present on your computer, [http://search.cpan.org/~gaas/libwww-perl-5.800/lib/LWP.pm here is a link] where you can download it.

Also, I emailed everyone the "Perl goodies" Vlad has sent out to the emails you posted on this wiki.

Step III

Once we have a script that performs all the required functions, the next step will be to run several experiments with EvoGEM and the produced files. We will also further optimize the simulation parameters for the evolution of the required circuits.

While these experiments are running, we will add several features to EvoGEM, namely mRNA and ribosome objects in the simulation space.

Neven, Josh and Taras - you're in charge of adding the ribosome class to EvoGEM. The first step will probably be just a simple outline of the class and some of its properties and methods. By Wednesday you should have a basic understanding of ribosomes, ribosome binding sites, mRNA and tRNA. You don't need to know every little detail but the main idea of translation should be familiar. Wikipedia should be useful.

As you may have guessed, I ask that you know this by Wednesday because you will be giving a presentation on this topic as well as presenting your basic class outline (just the idea of what the ribosome class can and can't do; you don't need to have a bulletproof implementation by then).

You can find a copy of EvoGEM on chuck in the drop boxes.

Kent - I'll need you to get off the AVL tree concept for a while and add an mRNA class to EvoGEM. Again, by Wednesday you should have a basic concept of what mRNA is, how it works, and how your class will work in relation to that. Be ready to give a 5-7 minute talk about mRNA and the mRNA class.
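To give a starting point for discussion, here is a purely hypothetical outline of how an mRNA class and a ribosome class might relate. It is written in Python and is not EvoGEM code; the class and method names are invented. As a simplification, the ribosome starts at the first AUG it finds instead of recognizing a ribosome binding site, and it returns codons rather than amino acids.

```python
START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}


class MRNA:
    """Holds a transcript as an upper-case RNA string (T is read as U)."""

    def __init__(self, sequence):
        self.sequence = "".join(sequence.split()).upper().replace("T", "U")


class Ribosome:
    """Reads an mRNA codon by codon from the first start codon."""

    def read(self, mrna):
        """Return the codons from the first AUG up to (not including) a stop."""
        start = mrna.sequence.find(START_CODON)
        if start == -1:
            return []  # nothing to translate without a start codon
        codons = []
        for i in range(start, len(mrna.sequence) - 2, 3):
            codon = mrna.sequence[i:i + 3]
            if codon in STOP_CODONS:
                break
            codons.append(codon)
        return codons


print(Ribosome().read(MRNA("ggaugaaauaagg")))  # ['AUG', 'AAA']
```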

You can find a copy of EvoGEM on beagle in your drop box.

While all of this is going on, feel free to ask me any questions about EvoGEM and the VIGO::3D environment, since you don't have much experience with it (looking at the code for some of the examples is very useful, though, since Ian documented those fairly well for exactly that purpose).

*If anyone wants to run through a practice presentation, let me know and I'll be more than happy to listen.

*If you wish to use Keynote but it is not installed on your computer, download a free iWork trial version for now. Once Christian comes back, we can make sure all the machines have it.

The Experiments

Part 3

Results
