Team:Calgary Software

From 2008.igem.org

Revision as of 21:48, 17 June 2008 by Kthachek
The University of Calgary Software Team
Our base of operation - the ICT building

The Project

EvoGEM is agent-based evolutionary design software developed at the University of Calgary in the summer of 2007. Although it attracted some interest at the 2007 Jamboree, the project presented that year was only a starting point. This summer, the team's goal is to bring EvoGEM to the point where it can develop sophisticated systems of practical use in synthetic biology.

This goal has two parts: improving the simulation side of the software, which strengthens the credibility of its results, and adding functionality to the evolutionary algorithm, which allows a more sophisticated search of the huge space that the registry, and synthetic biology in general, pose. We will integrate chemical structures into the model and interpret them in terms of the systems biology and synthetic biology of the modeled system.

The Plan

Currently, EvoGEM does not distinguish between the chemical or physical properties of the compounds that move through the system, so we are introducing chemistry into the model. To do this, we first need a body of reference data. We take an arbitrary part from the registry and analyze its sequence to determine which protein it codes for. Once the protein is identified, we determine its enzymatic function (if any) and the reaction it is involved in. Any chemical compounds associated with the protein are then recorded in the InChI and SMILES formats. The system uses this stored information as its basis for understanding which chemicals exist in its environment.

Step I

First off, we will characterize the following sample parts from the registry.

BBa_J45120

BBa_J45200

BBa_J45250

BBa_I0462

BBa_J45996


We translate the DNA sequence of each part into amino acids with a translation program.

For example, this DNA sequence:

atggaagttgttgaagttcttcacatgaatggaggaaatggagacagtagctatgcaaacaattctttggttcagcaaaaggtgattctcatgacaaagc caataactgagcaagccatgattgatctctacagcagcctctttccagaaaccttatgcattgcagatttgggttgttctttgggagctaacactttctt ggtggtctcacagcttgttaaaatagtagaaaaagaacgaaaaaagcatggttttaagtctccagagttttattttcacttcaatgatcttcctggcaat gattttaatacactttttcagtcactgggggcatttcaagaagatttgagaaagcatataggggaaagctttggtccatgttttttcagtggagtgcctg gttcattttatactagacttttcccttccaaaagtttacattttgtttactcctcctacagtctcatgtggctatctcaggtgcctaatgggattgaaaa taacaagggaaacatttacatggcaagaacaagccctctaagtgttattaaagcatactacaagcaatatgaaatagatttttcaaattttctcaagtac cgttcagaggaattgatgaaaggtggaaagatggtgttaacactcctaggtagagaaagtgaggatcctactagcaaagaatgctgttacatttgggagc ttctagccatggccctcaataagttggttgaagagggattgataaaagaagagaaagtagatgcattcaatattcctcaatacacaccatcaccagcaga agtaaagtacatagttgagaaggaaggatcattcaccattaatcgcttggaaacatcaagagttcattggaatgcttctaataatgagaagaatggtggt tacaatgtgtcaaggtgcatgagagctgtggctgagcctttgcttgtcagccactttgacaaggaattgatggatttagtgttccacaagtacgaagaga ttgtttctgattgcatgtccaaagagaatactgagtttataaatgtcatcatctccttgaccaaaataaattaa


Translates into this amino acid sequence (the trailing "-" marks the stop codon):

MEVVEVLHMNGGNGDSSYANNSLVQQKVILMTKPITEQAMIDLYSSLFPETLCIADLGCS LGANTFLVVSQLVKIVEKERKKHGFKSPEFYFHFNDLPGNDFNTLFQSLGAFQEDLRKHI GESFGPCFFSGVPGSFYTRLFPSKSLHFVYSSYSLMWLSQVPNGIENNKGNIYMARTSPL SVIKAYYKQYEIDFSNFLKYRSEELMKGGKMVLTLLGRESEDPTSKECCYIWELLAMALN KLVEEGLIKEEKVDAFNIPQYTPSPAEVKYIVEKEGSFTINRLETSRVHWNASNNEKNGG YNVSRCMRAVAEPLLVSHFDKELMDLVFHKYEEIVSDCMSKENTEFINVIISLTKIN-
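
The translation step above can be sketched in a few lines of Python. This is only an illustration, not the program the team used: it assumes the standard genetic code, reads the sequence in frame from its first base, and stops at the first stop codon. The names translate, CODON_TABLE and stop_symbol are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, stop_symbol="*"):
    """Translate a DNA string, in frame from its first base, into one-letter amino acids.

    Whitespace inside the sequence is ignored, case does not matter, and
    translation ends at the first stop codon, which is written as stop_symbol.
    Codons containing unexpected characters are written as "X".
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            protein.append(stop_symbol)
            break
        protein.append(aa)
    return "".join(protein)
```

Calling translate(sequence, stop_symbol="-") writes the stop codon the same way as the amino acid sequence shown above.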

Step II

We have developed a web-browsing script in Perl that retrieves relevant information from the parts registry as well as from other databases such as UniProt, PubChem and ChemSpider.

The final goal is to build a large-scale local registry of the same kind as the one we made in Step I.

The workflow of the software is as follows:

  • Retrieve information about every part in the registry
  • Use that information to characterize promoters, RBS and terminators
  • Retrieve information about produced proteins from different databases and use that to characterize reporters and protein coding regions
  • Retrieve information about reactions from different databases and use that to characterize specific enzymes produced by protein coding regions
  • Retrieve information about reactants and products from different databases to characterize specific compounds related to proteins (substrates, products, inducers, inhibitors, etc.)
  • Use all that information to build a flat file registry for EvoGEM to use

The following is the format of the file that stores our information.

In parts

[part name]

type = protein coding region

sequence = atggtgcatctatagacgtacgtcatgcagtactactgattattttagctgcacgtcagtacgctaac...taa

output = Heme oxygenase 1


In proteins_and_molecules

[Heme oxygenase 1]

sequence = MVGTRDSSFFT...*

input = InChI=1/C34H34N4O4.Fe/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16- + 3AH(2) + 3O(2) (notice InChI)

output = biliverdin + Fe(2+) + CO + 3 A + 3 H(2)O (notice no InChI)

weight = ... (This should be a number only, in kilodaltons (kDa))



[Heme]

InChI = InChI=1/C34H34N4O4.Fe/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-

SMILES = [Fe+2].O=C(O)CCc1c(c3[n-]c1cc/5nc(cc2[n-]c(c(c2C)\C=C)cc/4nc(c3)C(\C=C)=C\4C)\C(=C\5CCC(=O)O)C)C

synonyms = ...

weight = ... (This should be a number only, in kilodaltons (kDa))


Fields whose values could not be obtained should be omitted from the file.
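
As a rough illustration of this format, the following Python sketch writes and reads records made of a [name] header followed by "key = value" lines. It is not the team's Perl script. It assumes one field per line and that a value may itself contain "=" (as InChI strings do), so each line is split at the first "=" only. The function names format_record and parse_flat_file are hypothetical.

```python
def format_record(name, fields):
    """Render one record as a [name] header followed by 'key = value' lines.

    Fields whose value is missing (None or an empty string) are left out,
    so they never appear in the file.
    """
    lines = ["[" + name + "]"]
    for key, value in fields.items():
        if value is None or value == "":
            continue
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


def parse_flat_file(text):
    """Parse flat file text into a dict that maps record name to a dict of fields."""
    records = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate fields and records
        if line.startswith("[") and line.endswith("]"):
            current = records.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split at the first "=" only: values such as InChI strings contain "=".
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return records
```

For example, format_record("Heme oxygenase 1", {"sequence": "MVGTRDSSFFT...*", "weight": None}) produces a record with no weight line, and parse_flat_file reads the result back into a dictionary keyed by "Heme oxygenase 1".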

Step III

With all of the data gathered, we will run several experiments with EvoGEM and the generated files, and further tune the simulation parameters that govern the evolution of the required circuits. To give fast access to the flat file information at run time, we will store it in an AVL tree. While these experiments are running, we will add several features to EvoGEM, namely mRNA and ribosome objects in the simulation space.
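
To show why an AVL tree suits this kind of lookup, here is a minimal Python sketch of one that maps a record name to its fields. It is not EvoGEM's implementation. It assumes records are keyed by name and supports only insertion and lookup. All names (Node, insert, lookup and the helpers) are hypothetical.

```python
class Node:
    """One tree node holding a record name (key) and its fields (value)."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None
        self.height = 1


def _height(node):
    return node.height if node else 0


def _update(node):
    node.height = 1 + max(_height(node.left), _height(node.right))


def _balance(node):
    return _height(node.left) - _height(node.right)


def _rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    _update(y)
    _update(x)
    return x


def _rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    _update(x)
    _update(y)
    return y


def insert(node, key, value):
    """Insert or overwrite key in the subtree rooted at node and return the new root."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        node.left = insert(node.left, key, value)
    elif key > node.key:
        node.right = insert(node.right, key, value)
    else:
        node.value = value
        return node
    _update(node)
    balance = _balance(node)
    if balance > 1:  # left side too tall
        if _balance(node.left) < 0:
            node.left = _rotate_left(node.left)
        return _rotate_right(node)
    if balance < -1:  # right side too tall
        if _balance(node.right) > 0:
            node.right = _rotate_right(node.right)
        return _rotate_left(node)
    return node


def lookup(node, key):
    """Return the value stored under key, or None if the key is absent."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None
```

Because the tree rebalances after every insertion, its height stays logarithmic in the number of records even when names arrive in sorted order, which is the worst case for an unbalanced binary search tree.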



