Aging
Research Paper | Volume 17, Issue 11 | pp 2778–2808

A natural language processing–driven map of the aging research landscape

Jose Perez-Maletzki¹,², Jorge Sanz-Ros³
  • ¹ Universidad Europea de Valencia, Faculty of Health Sciences, Department of Physiotherapy, Nutrition and Sports Sciences, Valencia 46010, Spain
  • ² Group of Physical Therapy in the Ageing Process: Social and Health Care Strategies, Department of Physical Therapy, Universitat de València, Valencia 46010, Spain
  • ³ Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Received: May 27, 2025 | Accepted: November 7, 2025 | Published: November 25, 2025

Copyright: © 2025 Perez-Maletzki and Sanz-Ros. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Aging research has advanced significantly over the past century, from early studies on animal models to a current emphasis on clinical and translational applications. As research literature expands exponentially, traditional narrative reviews can no longer capture the field’s complexity, highlighting the need for new, unbiased synthesis tools. Here, we leverage advanced natural language processing (NLP) and machine learning (ML) techniques to analyze 461,789 abstracts related to aging published between 1925 and 2023. By integrating Latent Dirichlet Allocation (LDA), term frequency-inverse document frequency (TF-IDF) analysis, dimensionality reduction and clustering, we delineate a comprehensive thematic landscape of aging research. Our results show a clear shift: early decades focused on cellular and molecular mechanisms, while recent years emphasize clinical studies, especially neurodegenerative disorders. Notably, we identify a persistent divide between the biology of aging (BoA) and clinical research, with minimal conceptual overlap between them. Furthermore, we identify distinct clusters representing key biological processes, some of which may have previously been overlooked as cohesive research domains. Finally, we highlight both established and underexplored interconnections that could guide future research. This study outlines shifting priorities and translational gaps in aging research and offers a scalable, data-driven alternative to conventional reviews.
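
To illustrate the TF-IDF weighting named in the abstract, the following minimal Python sketch scores terms in a handful of toy abstracts. It is not the authors' pipeline: the function name `tfidf`, the whitespace tokenizer, the unsmoothed `log(N / df)` form of the inverse document frequency, and the three example sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / df), where df is the number of documents containing the term
    (an unsmoothed textbook variant; library implementations often smooth idf).
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy abstracts, invented for illustration only (not from the study corpus).
toy_docs = [
    "telomere shortening in aging cells",
    "cognitive decline in aging patients",
    "telomere length and cell senescence",
]
scores = tfidf(toy_docs)
```

In this sketch, a term confined to a single toy abstract (such as "shortening") receives a higher weight than a term shared across abstracts (such as "aging"), which is the property that makes TF-IDF useful for characterizing what distinguishes one group of documents from another.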