In this study we set out to explore the potential of distances between genomic symbols for finding CpG islands in DNA sequences. We examine the distributions of the inter-CpG and inter-SS distances under the assumption of independent nucleotides, which serves as the reference. We compare the empirical results for the complete human genome, for CpG islands and for non-CpG islands with the corresponding reference results. We propose a model that discriminates CpG islands based on statistical properties of the inter-dinucleotide distance distributions in DNA sequences. The results of this exploratory study suggest that the inter-SS distance has a high ability to discriminate CpG islands.
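As a rough illustration of the quantities involved, the sketch below computes empirical inter-CpG and inter-SS distance distributions for a toy sequence. It is not the authors' implementation. It assumes that a distance is the difference between the start positions of consecutive occurrences of a dinucleotide, and that "SS" denotes any dinucleotide made of two strong nucleotides (C or G, following the IUPAC code S). The function names and the example sequence are hypothetical.

```python
from collections import Counter

# Assumption: "S" is the IUPAC code for a strong nucleotide (C or G).
STRONG = {"C", "G"}


def is_cpg(a, b):
    """True for the dinucleotide CG."""
    return a == "C" and b == "G"


def is_ss(a, b):
    """True for any dinucleotide of two strong nucleotides (CC, CG, GC, GG)."""
    return a in STRONG and b in STRONG


def dinucleotide_positions(seq, match):
    """Start positions of every dinucleotide accepted by `match`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if match(seq[i], seq[i + 1])]


def inter_distances(positions):
    """Differences between consecutive start positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(distances):
    """Relative frequency of each observed distance."""
    counts = Counter(distances)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in sorted(counts.items())}


# Toy sequence, for illustration only.
seq = "ACGTCGGGATCG"
cpg_distances = inter_distances(dinucleotide_positions(seq, is_cpg))
ss_distances = inter_distances(dinucleotide_positions(seq, is_ss))
cpg_distribution = distance_distribution(cpg_distances)
ss_distribution = distance_distribution(ss_distances)
```

Distributions of this kind, computed separately for annotated CpG islands and for the remaining sequence, could then be compared with a reference obtained under nucleotide independence.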
|Communications in Computer and Information Science
|30th EURO Mini-conference on Optimization in the Natural Sciences, EmC-ONS 2014
|5/02/15 → 9/02/15