
Classification of DNA Sequence Using Machine Learning Techniques

EasyChair Preprint no. 8603

5 pages · Date: August 4, 2022

Abstract

DNA, the blueprint of life, is a long chain of repeating nucleotides that carries the genetic information of living organisms. Extracting information from DNA is an important research topic in genomics. Determining the order of the bases in a DNA molecule is called DNA sequencing, and deciding whether an unlabeled sequence belongs to an existing class is known as DNA sequence classification. This paper presents several machine learning techniques for DNA sequence classification and evaluates them on two public datasets, the promoters and splice datasets, on which they achieve noteworthy improvements. Among all the schemes tested, only two have a training accuracy below 90 percent, and most achieve a test accuracy above 90 percent. The experimental results show that several techniques outperform the remaining models.

Keyphrases: AdaBoost, Decision Tree, DNA sequence, DNA sequence classification, Gaussian processes, K-Nearest Neighbour, logistic regression, machine learning, Multi Layer Perceptron, Naive Bayes, Random Forest, Support Vector Machine
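
To make the classification task in the abstract concrete, the following is a minimal sketch of one of the listed techniques, K-Nearest Neighbour, applied to one-hot encoded DNA strings. It is not the authors' implementation: the encoding, the distance measure, the value of `k`, and the tiny toy dataset are all assumptions chosen for illustration, and the real promoters and splice datasets are not used here.

```python
from collections import Counter

# One-hot code for each nucleotide. This encoding is an assumption of the
# sketch, not necessarily the one used in the paper.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def encode(sequence):
    """Flatten a DNA string into a one-hot feature vector."""
    features = []
    for base in sequence.upper():
        features.extend(ONE_HOT[base])
    return features


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training vectors."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: squared_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: "+" sequences carry a TATA-like motif, "-" do not.
train = [
    ("GCTATAAT", "+"),
    ("ACTATAAG", "+"),
    ("GGTATAAC", "+"),
    ("GCGCGCGC", "-"),
    ("ACGGCCGT", "-"),
    ("CCGGACGG", "-"),
]
train_x = [encode(seq) for seq, _ in train]
train_y = [label for _, label in train]

# Classify an unlabeled sequence that resembles the "+" examples.
prediction = knn_predict(train_x, train_y, encode("TCTATAAT"))
print(prediction)  # prints "+"
```

Any of the other listed classifiers could be swapped in for `knn_predict` once the sequences are encoded as fixed-length vectors; the sketch only shows the general shape of the pipeline.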

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:8603,
  author = {Md. Ahsan Habib and Md. Motaleb Hossen Manik},
  title = {Classification of DNA Sequence Using Machine Learning Techniques},
  howpublished = {EasyChair Preprint no. 8603},
  year = {EasyChair, 2022}}