Discovery and extraction of motifs and/or profiles in biological sequences
Date
2016
Publisher
Université de M'sila
Abstract
Since Frederick Sanger developed DNA sequencing in the second half of the
1970s, the volume of biological sequence data has grown exponentially. This
growth, together with advances in computer technology, has given rise to
a new research area: bioinformatics. In short, bioinformatics attempts to
conceptualize biology in terms of molecules (in the sense of physical chemistry) and
applies informatics techniques to understand and organize the information
associated with these molecules on a large scale.
The biggest challenge that remains is the analysis of these data repositories and
the extraction of knowledge from them. Indeed, these databases constitute the
genetic heritage of all humanity, including the human genome and the sequences of
the plant and animal species identified so far. One of the most important
challenges in bioinformatics is the discovery of motifs in biological
sequences in order to determine the function or the family of biochemical molecules
(DNA, RNA, and protein). Since this challenge depends on analyzing textual data,
pattern matching algorithms are suitable candidates for tackling this problem.
In this thesis, we design a novel motif discovery algorithm to meet the demand for
finding motifs in biological sequences. The algorithm uses a pushdown automaton as
its matching mechanism, together with a counter, in an optimistic way.
Keywords
Sequences, Motifs, Patterns, DNA, RNA, Protein, Pattern Recognition, Data Mining, Knowledge Extraction, String Matching Algorithms, Motif Discovery Algorithms.