Author Type

Graduate Student

Date of Award

Spring 4-23-2026

Document Type

Thesis

Publication Status

Version of Record

Submission Date

May 2026

Department

Computer and Electrical Engineering and Computer Science

College Granting Degree

College of Engineering and Computer Science

Department Granting Degree

Electrical Engineering and Computer Science

Degree Name

Master of Science (MS)

Thesis/Dissertation Advisor [Chair]

Xingquan (Hill) Zhu

Abstract

The first step of biomedical NLP is clinical named entity recognition, which consists of identifying and categorizing clinical entities such as diseases, symptoms, genetics, diagnostic tests, and procedures in unstructured clinical text. This study presents a Retrieval-Augmented Generation (RAG) framework based on PubMed and UMLS that improves the ability of Large Language Models (LLMs) to identify clinical entities by supplying retrieved context. The framework is a two-stage pipeline: an initial LLM-based classification identifies candidate tokens, which are then refined using context retrieved from either PubMed or UMLS. The framework is evaluated on two established biomedical datasets, the NCBI Disease Corpus (binary classification) and MedMentions (multi-class classification), using three LLMs: LLaMA-70B, Qwen-35B, and GPT-5. The results show that retrieval from PubMed consistently improved or maintained F1 scores, indicating that the choice of retrieval source is a critical aspect of retrieval-augmented biomedical NLP systems.
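The two-stage pipeline described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the thesis's implementation: the function names, the "O" label for non-entities, the toy knowledge base, and the refinement rule are all hypothetical, and the LLM and the PubMed/UMLS retrieval calls are replaced by plain Python stand-ins so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases; none of these names come from the thesis.
Classifier = Callable[[List[str]], List[str]]  # tokens -> one label per token
Retriever = Callable[[str], List[str]]         # candidate token -> context snippets
Refiner = Callable[[List[str], int, str, List[str]], str]  # -> final label


def two_stage_ner(tokens: List[str], classify: Classifier,
                  retrieve: Retriever, refine: Refiner) -> List[str]:
    """Stage 1 labels every token; stage 2 revisits only the candidates."""
    initial = classify(tokens)           # stage 1: initial classification
    final = list(initial)
    for i, label in enumerate(initial):
        if label == "O":                 # assumed convention: "O" = not an entity
            continue
        context = retrieve(tokens[i])    # stage 2a: fetch context for the candidate
        final[i] = refine(tokens, i, label, context)  # stage 2b: refine the label
    return final


# Toy stand-ins (assumptions): a real system would call an LLM and a
# PubMed or UMLS retrieval service here.
TOY_KB: Dict[str, List[str]] = {
    "cystic": ["Cystic fibrosis is an inherited disorder."],
    "fibrosis": ["Cystic fibrosis is an inherited disorder."],
}


def toy_classify(tokens: List[str]) -> List[str]:
    # Deliberately over-eager: "today" is a false positive to be refined away.
    flagged = {"cystic", "fibrosis", "today"}
    return ["Disease" if t.lower() in flagged else "O" for t in tokens]


def toy_retrieve(token: str) -> List[str]:
    return TOY_KB.get(token.lower(), [])


def toy_refine(tokens: List[str], index: int, label: str,
               context: List[str]) -> str:
    # Assumed rule: keep the candidate label only if some context was retrieved.
    return label if context else "O"


tokens = ["Patient", "has", "cystic", "fibrosis", "today"]
print(two_stage_ner(tokens, toy_classify, toy_retrieve, toy_refine))
```

With these stand-ins, the candidate "today" is flagged in the first stage and dropped in the second because no context is retrieved for it. Passing the retriever as a parameter mirrors the abstract's comparison of retrieval sources: the same pipeline can be run with a PubMed-backed or a UMLS-backed retriever.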

Recommended Citation

Tripathi, Apoorv, "LLM FOR CLINICAL NAMED ENTITY RECOGNITION: A STUDY ON RAG WITH PUBMED and UMLS" (2026). Electronic Theses and Dissertations. 345.
https://digitalcommons.fau.edu/etd_general/345

Included in

Computer Engineering Commons

Electronic Theses and Dissertations

LLM FOR CLINICAL NAMED ENTITY RECOGNITION: A STUDY ON RAG WITH PUBMED and UMLS
