Discovery and extraction of motifs and/or profiles in biological sequences.

Date

2016-01-21

Publisher

Université de M'sila

Abstract

Since Frederick Sanger introduced DNA sequencing in the second half of the 1970s, the volume of biological sequence data has grown exponentially, driven by advances in computer technology, and has given rise to a new research area: bioinformatics. In short, bioinformatics attempts to conceptualize biology in terms of molecules (in the sense of physical chemistry) and applies informatics techniques to understand and organize the information associated with these molecules on a large scale. The biggest challenge that remains is the analysis and extraction of knowledge from these data repositories. Indeed, these databases, which include the human genome and the sequences of the plant and animal species identified so far, constitute the genetic heritage of all humanity. One of the most important challenges in bioinformatics is therefore the discovery of motifs in biological sequences in order to determine the function or the family of biochemical molecules (DNA, RNA, and protein). Since this challenge depends on analyzing textual data, pattern matching algorithms are suitable candidates for tackling it. In this thesis, we design a novel motif discovery algorithm that finds motifs in biological sequences, using a pushdown automaton together with a counter as the matching mechanism, in an optimistic manner.

Keywords

Sequences, Motifs, Patterns, DNA, RNA, Protein, Pattern Recognition, Data Mining, Knowledge Extraction, String Matching Algorithms, Motif Discovery Algorithms.
