Aging
Research Paper | Volume 12, Issue 5 | pp 4445-4462

Distribution patterns of microsatellites and development of its marker in different genomic regions of forest musk deer genome based on high throughput sequencing

Wen-Hua Qi1,2, Ting Lu2, Cheng-Li Zheng3, Xue-Mei Jiang4, Hang Jie5, Xiu-Yue Zhang2, Bi-Song Yue2, Gui-Jun Zhao5
  • 1Chongqing Engineering Laboratory of Green Planting and Deep Processing of Three Gorges Reservoir Famous-region Drug, College of Biology and Food Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China
  • 2Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
  • 3Sichuan Institute of Musk Deer Breeding, Chengdu 611830, P. R. China
  • 4College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China
  • 5Chongqing Engineering Technology Research Center for GAP of Genuine Medicinal Materials, Chongqing Institute of Medicinal Plant Cultivation, Chongqing 408435, P. R. China
* Equal contribution
Received: November 2, 2019 | Accepted: February 25, 2020 | Published: March 10, 2020

Copyright © 2020 Qi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

The forest musk deer (Moschus berezovskii, FMD) is an endangered artiodactyl species, and male FMD produce musk. We sequenced the whole genome of FMD, completed the genome assembly and annotation, and performed bioinformatic analyses. Our results showed that microsatellites (SSRs) were nonrandomly distributed across genomic regions, and SSR abundances were much higher in the intronic and intergenic regions than in other genomic regions. Tri- and hexanucleotide perfect (P) SSRs predominated in coding regions (CDSs), whereas tetra- and pentanucleotide P-SSRs were less abundant. Trifold P-SSRs had higher GC content in the 5′-untranslated regions (5′UTRs) and CDSs than in other genomic regions, whereas mononucleotide P-SSRs had the lowest GC content. For each of the mono- to hexanucleotide P-SSR types, the distribution of repeat copy numbers (RCN) differed among genomic regions. The RCN of trinucleotide P-SSRs was significantly higher in the CDSs than in the transposable elements (TEs), intronic regions, and intergenic regions. Analysis of the coefficient of variability (CV) of P-SSRs showed that the RCN of mononucleotide P-SSRs had the highest relative variation across genomic regions, followed, in descending order of CV, by dinucleotide P-SSRs > trinucleotide P-SSRs > tetranucleotide P-SSRs > pentanucleotide P-SSRs > hexanucleotide P-SSRs. For each of the mono- to hexanucleotide P-SSR types, the CV of RCN was relatively high in the intronic and intergenic regions, intermediate in the TEs, and relatively low in the 5′UTRs, CDSs, and 3′UTRs. Genotyping DNA from 36 captive FMD detected 58 novel polymorphic SSR loci, and 22 SSR markers ultimately showed polymorphism, stability, and repeatability.
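To make the two quantities used throughout the abstract concrete, the following minimal Python sketch shows one way the repeat copy number (RCN) of perfect SSRs and its coefficient of variability (CV = standard deviation / mean) could be computed. It is an illustration only, not the pipeline used in this study: the function names, the minimum-repeat thresholds in `MIN_REPEATS`, the use of the population standard deviation, and the toy sequence are all assumptions introduced here.

```python
import re
from statistics import mean, pstdev

# ASSUMPTION: minimum repeat copy numbers per motif length (mono- to hexa-).
# These thresholds are illustrative and are not taken from the paper.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq):
    """Return (motif, repeat_copy_number, start) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each tract is reported under its shortest motif only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((motif, len(m.group(0)) // k, m.start()))
    return hits


def coefficient_of_variation(values):
    """CV of a list of RCN values; population standard deviation is assumed."""
    return pstdev(values) / mean(values)


if __name__ == "__main__":
    # Toy sequence (made up): a 12-copy A tract and a 6-copy CAG tract.
    toy = "GG" + "A" * 12 + "CCGT" + "CAG" * 6 + "TT"
    for motif, rcn, start in find_perfect_ssrs(toy):
        print(motif, rcn, start)
    # Made-up RCN values for one motif type in one hypothetical region.
    print(coefficient_of_variation([4, 6]))
```

Applied separately to the P-SSRs of each genomic region (5′UTRs, CDSs, introns, 3′UTRs, TEs, intergenic regions), a calculation of this kind would yield per-region CV values that can be compared across motif lengths, which is the type of comparison summarized above.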