Object-Oriented Analysis, Bioinformatics Databases, Protein Fingerprints, Pattern Recognition
Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Serdica Journal of Computing, Vol. 12, No 3 (2018), pp. 175-190
A new approach, based on object-oriented programming, to the search and classification of protein patterns in a protein database is developed. To improve the analysis of protein sequences and structures, certain basic patterns, such as motifs and fingerprints, are used. An unknown protein sequence can be classified by searching it for the fingerprints stored in the database and assessing each match by estimating its statistical significance. The current implementation of the model search algorithm is a powerful tool that uses the PRINTS database as an example; however, it does not support the addition of new features, owing to the conservative design of the program and the lack of publicly available code. A new version of the PRINTS database has recently been developed, and this will require new features to be added in the future. A novel object-oriented model for the implementation of the algorithm is proposed. This model is used to build a web application prototype written in Python, the most widely used programming language in bioinformatics at present. The result of this study is maintainable, open-source software that can easily be extended with new functionality.
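As an illustration of what an object-oriented treatment of motifs and fingerprints might look like, the following minimal Python sketch models a motif as a set of equal-length aligned segments and a fingerprint as an ordered group of motifs. The class names `Motif` and `Fingerprint`, the frequency-based score and the `threshold` parameter are all assumptions made for this example; they are not taken from the paper or from the PRINTS software, and the statistical significance estimation mentioned in the abstract is not modelled here.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Motif:
    """Hypothetical model of one conserved region (equal-length segments)."""
    name: str
    segments: List[str]
    counts: List[Counter] = field(init=False)

    def __post_init__(self):
        width = len(self.segments[0])
        if any(len(s) != width for s in self.segments):
            raise ValueError("all segments must have the same length")
        # One residue counter per alignment column.
        self.counts = [Counter(column) for column in zip(*self.segments)]

    @property
    def width(self) -> int:
        return len(self.counts)

    def score(self, window: str) -> float:
        """Sum of per-column residue counts, normalised to the range [0, 1]."""
        total = sum(c[residue] for c, residue in zip(self.counts, window))
        return total / (len(self.segments) * self.width)

    def best_hit(self, sequence: str) -> Optional[Tuple[float, int]]:
        """Return (score, start) of the best window, or None if too short."""
        starts = range(len(sequence) - self.width + 1)
        hits = [(self.score(sequence[i:i + self.width]), i) for i in starts]
        return max(hits, key=lambda hit: hit[0]) if hits else None


@dataclass
class Fingerprint:
    """Hypothetical model of a fingerprint: an ordered group of motifs."""
    name: str
    motifs: List[Motif]

    def scan(self, sequence: str, threshold: float = 0.5) -> Dict[str, Tuple[float, int]]:
        """Map motif name to (score, start) for motifs scoring >= threshold."""
        result = {}
        for motif in self.motifs:
            hit = motif.best_hit(sequence)
            if hit is not None and hit[0] >= threshold:
                result[motif.name] = hit
        return result


# Toy data, invented for the example only.
motif_a = Motif("A", ["ACD", "ACE", "ACD"])
motif_b = Motif("B", ["WW", "WW"])
fingerprint = Fingerprint("toy", [motif_a, motif_b])
hits = fingerprint.scan("GGACDGG")
```

Under these assumptions, a new scoring scheme or significance test could be added by subclassing `Motif` and overriding `score`, without touching the scanning code; this is the kind of extensibility the object-oriented design aims at.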