Data Retrieval and Storage
From 2008.igem.org
Our first challenge was finding a way to expand the database for EvoGEM. Last year, EvoGEM had only a small database of BioBrick parts, all of which were added manually. Since the iGEM registry consisted of hundreds of parts, adding them manually was not practical. In addition, more parts were needed to run more sophisticated tests with EvoGEM, and we wanted a way to compare the retrieved parts. We needed to answer the following questions about each part:
- If it is an enzyme, what reactions does it catalyze?
- If it is a molecule, what is its molecular structure, and what are the synonyms for the molecule name?
The answers to these questions would allow EvoGEM to better distinguish between different compounds. How do we accomplish this, though? By creating a Perl script!

Perl is a programming language with powerful text-processing facilities. Because it handles string matching efficiently, it is well suited to searching and manipulating text files, which is exactly what we needed for retrieving data and expanding EvoGEM's local database.
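To illustrate the kind of string matching described above, here is a minimal sketch of pulling a reaction out of a text record and splitting it into reactants and products. It is written in Python rather than Perl, and it is not the team's script: the record layout, the `CATALYTIC ACTIVITY:` label, and the names `SAMPLE_RECORD`, `REACTION_PATTERN` and `extract_reactions` are all assumptions made for illustration.

```python
import re

# Hypothetical record text, loosely modeled on a flat-file protein annotation.
# The layout and the enzyme shown are assumptions for illustration only.
SAMPLE_RECORD = """\
ID   EXAMPLE_ENZYME
DE   Example kinase
CC   -!- CATALYTIC ACTIVITY: ATP + D-glucose = ADP + D-glucose 6-phosphate.
CC   -!- SUBUNIT: Monomer.
"""

# Capture the text between the label and the period that ends the line.
REACTION_PATTERN = re.compile(r"CATALYTIC ACTIVITY:\s*(.+?)\.\s*$", re.MULTILINE)


def extract_reactions(record):
    """Return a list of (reactants, products) pairs found in the record."""
    reactions = []
    for match in REACTION_PATTERN.finditer(record):
        equation = match.group(1)
        if " = " not in equation:
            continue  # not written as "left = right"; skip it
        left, right = equation.split(" = ", 1)
        reactants = [name.strip() for name in left.split(" + ")]
        products = [name.strip() for name in right.split(" + ")]
        reactions.append((reactants, products))
    return reactions
```

Calling `extract_reactions(SAMPLE_RECORD)` yields one pair of name lists, which could then be looked up one by one in a chemical database.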
Revision as of 01:51, 29 October 2008
Perl
- If it is an enzyme, what reactions does it catalyze?
- If it is a molecule, what is its molecular structure, and what are the synonyms for the molecule name?
UniProt
ChemSpider
The reagents from the reaction that the protein catalyzes are looked up in ChemSpider. This large database is much like UniProt, except that it covers chemical compounds rather than proteins. Querying ChemSpider is simple because molecules can be searched by any of their synonyms. For each molecule, ChemSpider returns information such as its synonyms and its SMILES string (SMILES stands for simplified molecular input line entry specification). Useful as this information is, our main reason for using this database is to obtain a machine-readable representation that can be used to compare metabolic pathways: the IUPAC International Chemical Identifier (InChI). An InChI is a unique "fingerprint" of a molecule. Unlike SMILES, where the same molecule can be written as several different valid strings, a given structure always yields the same InChI, and the format is defined by IUPAC. An example of an InChI looks like this:
1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-8,10-11H,1H2/t2-,5+/m0/s1
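Because an InChI is a sequence of layers separated by `/`, it is easy to take apart with ordinary string handling. The sketch below, in Python rather than the team's Perl, splits the example above into its version, formula, and prefixed layers and compares two identifiers. The names `EXAMPLE_INCHI`, `parse_inchi` and `same_molecule` are hypothetical, and the parser is a simplification: it assumes each layer prefix appears only once, which does not hold for identifiers with fixed-hydrogen or isotopic sections.

```python
EXAMPLE_INCHI = "1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-8,10-11H,1H2/t2-,5+/m0/s1"


def parse_inchi(inchi):
    """Split an InChI string into its version, formula, and prefixed layers."""
    # Accept the identifier with or without the leading "InChI=" tag.
    if inchi.startswith("InChI="):
        inchi = inchi[len("InChI="):]
    parts = inchi.split("/")
    if len(parts) < 2:
        raise ValueError("expected at least a version and a formula")
    parsed = {"version": parts[0], "formula": parts[1], "layers": {}}
    for layer in parts[2:]:
        if not layer:
            continue  # ignore empty segments such as a trailing slash
        # Each layer starts with a one-letter prefix: "c" for connectivity,
        # "h" for hydrogens, "t", "m" and "s" for stereochemistry.
        parsed["layers"][layer[0]] = layer[1:]
    return parsed


def same_molecule(inchi_a, inchi_b):
    """Treat two identifiers as the same molecule if all their layers agree."""
    return parse_inchi(inchi_a) == parse_inchi(inchi_b)
```

Comparing whole identifiers, as `same_molecule` does, is what makes InChI convenient for matching compounds across metabolic pathways; looking at individual layers (for example only the formula and `c` layer) would allow a looser match that ignores stereochemistry.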
To see this database, visit: [http://www.chemspider.com ChemSpider]
The Algorithm