BulDML at Institute of Mathematics and Informatics: Derivation of Context-free Stochastic L-Grammar Rules for Promoter Sequence Modeling Using Support Vector Machine

BulDML at Institute of Mathematics and Informatics >
ITHEA >
International Book Series Information Science and Computing >
2008 >
Book 2 Advanced Research in Artificial Intelligence >

Please use this identifier to cite or link to this item: http://hdl.handle.net/10525/1036

Title:	Derivation of Context-free Stochastic L-Grammar Rules for Promoter Sequence Modeling Using Support Vector Machine
Authors:	Damaševičius, Robertas
Keywords:	Stochastic Context-Free L-Grammar; DNA Modeling; Machine Learning; Data Mining; Bioinformatics
Issue Date:	2008
Publisher:	Institute of Information Theories and Applications FOI ITHEA
Abstract:	Formal grammars can be used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and the morphology of a variety of organisms. We believe that parallel grammars can also be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions, which makes promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the Drosophila and vertebrate promoter datasets using a genetic programming technique, and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L-grammar rules are analyzed and compared with natural promoter sequences.
URI:	http://hdl.handle.net/10525/1036
ISSN:	1313-0455
Appears in Collections:	Book 2 Advanced Research in Artificial Intelligence
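
As a rough illustration of the parallel rewriting that the abstract attributes to context-free stochastic L-grammars, the sketch below applies probabilistic production rules to every symbol of a DNA string in the same derivation step. The rule table, the probabilities, the axiom "TATA", and the function names `rewrite` and `derive` are all assumptions invented for this example; they are not the rules derived in the paper, and the genetic programming search and SVM-based fitness evaluation are not shown.

```python
import random

# Hypothetical production rules, invented for illustration only.
# They are NOT the rules derived in the paper.
# Each symbol maps to a list of (successor, probability) pairs.
RULES = {
    "A": [("AT", 0.5), ("A", 0.5)],
    "C": [("CG", 0.3), ("C", 0.7)],
    "G": [("G", 1.0)],
    "T": [("TA", 0.4), ("T", 0.6)],
}


def rewrite(sequence, rules, rng):
    """One parallel derivation step: every symbol is rewritten at once."""
    out = []
    for symbol in sequence:
        successors, weights = zip(*rules[symbol])
        # Context-free: the choice depends only on the symbol itself.
        out.append(rng.choices(successors, weights=weights, k=1)[0])
    return "".join(out)


def derive(axiom, rules, steps, seed=0):
    """Apply `steps` derivation steps to the axiom with a seeded generator."""
    rng = random.Random(seed)
    sequence = axiom
    for _ in range(steps):
        sequence = rewrite(sequence, rules, rng)
    return sequence


if __name__ == "__main__":
    print(derive("TATA", RULES, steps=3))
```

Fixing the seed makes a derivation reproducible, which would matter if generated sequences were later scored by a classifier; a different seed yields a different artificial sequence from the same rule set.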

Files in This Item:

File	Description	Size	Format
IBS-02-p13.pdf		178.17 kB	Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
