Objectives: As the number of fully sequenced genomes rises, high-throughput genome annotation becomes increasingly important. Several automatic annotation systems exist, but their quality and flexibility are limited. We therefore developed a semi-automatic approach called DENTOTI (etymology: English dental + Latin totus "whole, entire" + English -i) to facilitate annotation of the oral pathogen genomes hosted in the ORALGEN database (http://www.oralgen.lanl.gov/). DENTOTI was designed and implemented to support oral pathogen sequence analysis, metabolic reconstruction and comparative genome analysis. Methods: The DENTOTI approach is based on a common heuristic for identifying orthologs, bidirectional best hits (Overbeek et al., 1999): if the sequence in genome 2 most similar to protein A of genome 1 is B, and the sequence in genome 1 most similar to protein B is A, then A and B are bidirectional best hits and are operationally considered orthologs. The relationship is especially strong when the BLAST E value is very small and the alignment spans a majority of each sequence. Results: We applied the DENTOTI approach to six closely related genomes from the genus Streptococcus: Streptococcus agalactiae, Streptococcus mitis, Streptococcus mutans, Streptococcus pneumoniae, Streptococcus sanguinis and Streptococcus thermophilus. This allowed us to construct the Streptococcus in Toto database and to transfer functional annotation from previously annotated Streptococcus genomes to approximately 70% of the genes in Streptococcus sanguinis and Streptococcus mitis. The Streptococcus in Toto website is available at http://oralgen.lanl.gov/streptoto/. Conclusion: This approach is very useful for the initial assignment of function to genes in groups of closely related species. This work is supported by NIH NIDCR under LANL contract no. Y1-DE-6006-02.
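
The bidirectional-best-hit criterion described in the Methods can be illustrated with a minimal Python sketch. It is not the DENTOTI implementation: the `Hit` record, the function names, the use of bit score to rank hits, and the default thresholds (`max_evalue=1e-5`, `min_coverage=0.5`) are all assumptions made for illustration, since the abstract gives no cutoff values. The input is assumed to be pre-parsed BLAST hits from searching genome 1 against genome 2 and the reverse.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One pre-parsed BLAST hit (hypothetical record layout)."""
    query: str          # protein ID in the query genome
    subject: str        # protein ID in the searched genome
    evalue: float       # BLAST E value (smaller is better)
    bitscore: float     # BLAST bit score (larger is better)
    query_cov: float    # fraction of the query covered by the alignment
    subject_cov: float  # fraction of the subject covered by the alignment


def best_hits(hits: Iterable[Hit],
              max_evalue: float = 1e-5,      # assumed cutoff, not from the source
              min_coverage: float = 0.5      # assumed reading of "a majority"
              ) -> Dict[str, str]:
    """Map each query protein to its highest-scoring hit that passes the filters."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.evalue > max_evalue:
            continue
        if min(h.query_cov, h.subject_cov) < min_coverage:
            continue
        current = best.get(h.query)
        if current is None or h.bitscore > current.bitscore:
            best[h.query] = h
    return {query: h.subject for query, h in best.items()}


def bidirectional_best_hits(hits_1_vs_2: Iterable[Hit],
                            hits_2_vs_1: Iterable[Hit],
                            **filters: float) -> List[Tuple[str, str]]:
    """Return (A, B) pairs where B is A's best hit and A is B's best hit."""
    forward = best_hits(hits_1_vs_2, **filters)
    reverse = best_hits(hits_2_vs_1, **filters)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data: A3's best hit is B1, but B1's best hit is A1, so A3 gets no ortholog.
forward = [
    Hit("A1", "B1", 1e-50, 200.0, 0.90, 0.95),
    Hit("A1", "B2", 1e-20, 90.0, 0.80, 0.80),
    Hit("A2", "B2", 1e-30, 120.0, 0.70, 0.90),
    Hit("A3", "B1", 1e-15, 80.0, 0.60, 0.60),
]
reverse = [
    Hit("B1", "A1", 1e-50, 200.0, 0.95, 0.90),
    Hit("B1", "A3", 1e-15, 80.0, 0.60, 0.60),
    Hit("B2", "A1", 1e-20, 90.0, 0.80, 0.80),
    Hit("B2", "A2", 1e-30, 120.0, 0.90, 0.70),
]
pairs = bidirectional_best_hits(forward, reverse)
print(pairs)
```

In a sketch like this, functional annotation would be transferred only along the returned pairs; proteins such as `A3`, whose best hit is not reciprocated, would be left for manual curation, in keeping with the semi-automatic workflow the abstract describes.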