Aging
Research Paper | Volume 8, Issue 11 | pp 2635-2654

Distinct patterns of simple sequence repeats and GC distribution in intragenic and intergenic regions of primate genomes

Wen-Hua Qi1,2, Chao-chao Yan1, Wu-Jiao Li1, Xue-Mei Jiang3, Guang-Zhou Li4, Xiu-Yue Zhang1, Ting-Zhang Hu2, Jing Li1, Bi-Song Yue1
  • 1Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, China
  • 2College of Life Science and Engineering, Chongqing Three Gorges University, Chongqing 404100, China
  • 3College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing 404100, China
  • 4College of Sport and Health, Chongqing Three Gorges University, Chongqing 404100, China

* Equal contribution

Received: June 8, 2016 | Accepted: August 22, 2016 | Published: September 16, 2016

Abstract

In the first systematic examination of simple sequence repeats (SSRs) and guanine-cytosine (GC) distribution in the intragenic and intergenic regions of ten primates, we found that both SSRs and GC content were nonrandomly distributed in intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation. Our results suggest that the majority of SSRs are located in non-coding regions, such as introns, TEs, and intergenic regions. In these primates, trinucleotide perfect (P) SSRs were the most abundant repeat type in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs were the most abundant in the introns, 3'UTRs, TEs, and intergenic regions. GC content varied greatly among the intragenic and intergenic regions, in the order 5'UTRs > CDSs > 3'UTRs > TEs > introns > intergenic regions, and high GC content was frequently found in exon-rich regions. Within the same intragenic or intergenic region, the distribution of GC content was highly similar across the different primates. Tri- and hexanucleotide P-SSRs had the highest GC content in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs had the lowest GC content in all six genomic regions of these primates. The most frequent motifs of each repeat length varied markedly among the genomic regions.
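
The two quantities compared throughout the abstract, GC content and perfect SSRs of motif length one to six, can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices made only for illustration, and the abstract does not state the thresholds the authors applied.

```python
import re

def gc_content(seq):
    """Return the fraction of G and C among the A/C/G/T bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted

# Hypothetical minimum repeat counts per motif length (mono- to hexanucleotide).
# These thresholds are assumptions for illustration, not values from the paper.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect SSRs; return sorted (start, end, motif, repeat_count) tuples."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k)):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGC" + "A" * 12 + "TTT" + "CAG" * 5 + "T"
    for start, end, motif, count in find_perfect_ssrs(example):
        tract = example[start:end]
        print(start, end, motif, count, round(gc_content(tract), 2))
```

Applied separately to the sequences of each annotated region (for example, all CDSs or all introns of one genome), functions of this kind would yield per-region SSR counts and GC fractions comparable to those summarized above. Coordinates are 0-based and half-open, and a motif is reported in the reading frame in which the scan first encounters it.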