PERFORMANCE EVALUATION OF A LARGE LANGUAGE MODEL-BASED TOOL FOR NUTRITIONAL RECOMMENDATIONS IN CHRONIC KIDNEY DISEASE

  • Carlos Matias Callegari Asociación Nefrológica de Buenos Aires
  • Gonzalo Garcia Sociedad Argentina de Nefrología
  • Cristina Milano Asociación Nefrológica de Buenos Aires
  • Judith Leibovich Asociación Nefrológica de Buenos Aires
  • Florencia Cardone Asociación Nefrológica de Buenos Aires

Abstract

Introduction: Nutritional management is an essential component of treatment for chronic kidney disease (CKD); however, its implementation is hampered by the complexity of dietary recommendations and a shortage of specialized professionals. Large language models (LLMs) could complement professional consultations through virtual assistance tools, but their performance in renal nutrition specifically has not yet been adequately evaluated.

Objective: To evaluate the performance of NutriRenal, a virtual assistant based on a large language model adjusted using an expert-designed prompt, in answering nutrition-related queries from patients with CKD, as rated by nutritionists specializing in CKD.

Methods: A descriptive, cross-sectional study was conducted in which three specialized nutritionists evaluated 211 responses generated by NutriRenal to questions formulated by nephrologists. Responses were rated on three dimensions (comprehensibility, completeness, and consistency with scientific evidence) using a scale of 1 to 3. Ratings were compared before and after the prompt was adjusted, and by CKD stage, presence of diabetes, and evaluator.

Results: After the prompt was adjusted, NutriRenal demonstrated high performance: 99% of responses were rated as adequate in comprehensibility, 86.7% in completeness, and 95.2% in consistency with scientific evidence. These improvements were statistically significant compared with the original prompt. Performance was consistent across the subgroups evaluated, with responses for patients with diabetes receiving the highest scores.

Conclusions: NutriRenal demonstrated robust performance after the prompt adjustment, generating high-quality responses according to the professional criteria evaluated. Its implementation could be a valuable complement to traditional nutritional consultations for patients with CKD. However, further studies in real-world clinical settings are needed to validate its impact on daily clinical practice.

 

Published
2025-12-10
How to Cite
Callegari CM, Garcia G, Milano C, Leibovich J, Cardone F. PERFORMANCE EVALUATION OF A LARGE LANGUAGE MODEL-BASED TOOL FOR NUTRITIONAL RECOMMENDATIONS IN CHRONIC KIDNEY DISEASE. Rev Nefrol Dial Traspl [Internet]. 2025 Dec 10 [cited 2025 Dec 16];45(04):173-8. Available from: http://revistarenal.org.ar/index.php/rndt/article/view/1111
Section
Original Article