Figure 1. Overview of the main method used in this work. (A) The general pipeline. (B, C) Predominant topics identified by BERTopic in (B) the grant texts and (C) the PubMed texts. (D) Distribution of token lengths for protein-coding genes. (E) Number of unique tokens found at each indicated position within the gene name.