Original Articles: 2018 Vol: 10 Issue: 9
A Novel Numerical Characterization of DNA Sequences Based on Two-Base and its Application
Abstract
Analyzing DNA sequences is a fundamental task in bioinformatics. Traditionally, DNA sequences are compared by alignment methods, which are costly in both time and space. In this paper, a novel alignment-free method is proposed based on the position information of pairs of adjacent nucleotides. A DNA sequence is transformed into a 48-dimensional vector comprising the frequency, the mean position, and the position variance of each of the 16 possible two-base combinations (16 × 3 = 48). The Euclidean distances between these vectors are then calculated to carry out the similarity analysis. Finally, the results of the double-nucleotide vector are compared with those of the Clustal W method and of a single-nucleotide vector.
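The construction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so the following are assumptions made only for illustration: positions are 1-based and refer to the first base of each pair, the frequency is the pair count divided by the number of adjacent pairs in the sequence, the variance is the population variance, and pairs that never occur contribute zeros. The names `dinucleotide_vector` and `euclidean_distance` are hypothetical and not taken from the paper.

```python
from itertools import product
from math import sqrt

# The 16 ordered pairs of bases: AA, AC, AG, AT, CA, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_vector(seq):
    """Return a 48-element list: (frequency, mean position, position variance)
    for each of the 16 dinucleotides, in the order of DINUCLEOTIDES.

    Assumed conventions (not confirmed by the source): 1-based positions,
    frequency normalized by the number of adjacent pairs, population variance.
    """
    seq = seq.upper()
    positions = {d: [] for d in DINUCLEOTIDES}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in positions:  # skip pairs containing ambiguous symbols such as N
            positions[pair].append(i + 1)

    total_pairs = max(len(seq) - 1, 1)  # avoid division by zero on tiny inputs
    vector = []
    for d in DINUCLEOTIDES:
        pos = positions[d]
        if pos:
            freq = len(pos) / total_pairs
            mean = sum(pos) / len(pos)
            var = sum((p - mean) ** 2 for p in pos) / len(pos)
        else:
            freq = mean = var = 0.0
        vector.extend([freq, mean, var])
    return vector


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    a = dinucleotide_vector("ACGTACGTTGCA")
    b = dinucleotide_vector("ACGTTCGTAGCA")
    print(len(a), euclidean_distance(a, b))
```

A smaller distance is read as greater similarity between the two sequences. Under these assumed definitions the mean and variance components grow with sequence length, so some rescaling may be needed before comparing sequences of very different lengths; whether the paper applies such a step is not stated in the abstract.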