
DOI: 10.31038/JCRM.2025831

Abstract

Quantitative electroencephalography (qEEG) offers objective biomarkers of brain function across neuropsychiatric conditions, but clinical EEG case reports are traditionally labor-intensive to produce. We describe a reproducible Python-based pipeline that automatically processes raw BrainVision EEG data, extracts spectral qEEG features, integrates patient clinical scores (e.g. the Brief Psychiatric Rating Scale, BPRS), retrieves relevant literature via Europe PMC, and uses a retrieval-augmented large language model (RAG-LLM) to generate structured narrative case reports. EEG preprocessing (filtering, artifact removal, referencing) and feature computation (power in the delta, theta, alpha, and beta bands, etc.) are implemented using open-source MNE-Python tools in a BIDS-compliant framework [1,2]. Patient metadata such as age, diagnosis, and BPRS severity provide clinical context alongside EEG features (e.g. the known increase in theta power and theta/beta ratio in schizophrenia [3]). Key EEG findings are combined with dynamically retrieved evidence from Europe PMC – an open-access repository of ~36 million biomedical abstracts and 5 million full-text articles – to ground the report in up-to-date knowledge [4]. Using a RAG-LLM approach, the system formulates context-aware prompts that guide the model to cite recent studies and summarize findings. For example, prior work has shown that retrieval-augmented LLMs significantly improve accuracy in clinical question answering compared to base models [5], and dedicated frameworks (e.g. EEG-MedRAG [6]) unify EEG domain knowledge and patient data for diagnostic guidance. Our pipeline yields a draft case report that mimics the structure of a clinician’s report: background, methods, results (EEG summary and clinical scores), and an evidence-supported discussion.

Keywords

qEEG, RAG, Clinical Data, EEG Reports, AI Neuropsychiatry, EEG Preprocessing, Spectral Features, BPRS, Case Automation, Neuroinformatics

Introduction

EEG remains an indispensable tool in neuroscience and psychiatry, providing noninvasive recordings of brain activity. Quantitative EEG (qEEG) – the analysis of EEG power-spectrum bands – has been studied as a putative biomarker in disorders such as schizophrenia, ADHD, depression, and bipolar disorder [3,7]. For example, schizophrenia is often associated with increased delta/theta and reduced alpha power [3]. Clinical context is captured by symptom scales such as the BPRS, which quantify the severity of psychiatric symptoms [3] and are routinely collected in research studies. Integrating EEG features with clinical metrics can enhance interpretation (e.g. correlating a theta increase with the BPRS depression subscore). However, manually generating a cohesive case report that synthesizes EEG analyses with relevant literature is time-consuming and subjective.

Recent advances in large language models (LLMs) and retrieval-augmented generation (RAG) offer a new paradigm: an AI-driven pipeline can automatically assemble multimodal data and external knowledge into an explanatory narrative. LLMs have demonstrated capability in medical domains, but they are prone to hallucination unless grounded by real data [5,8]. RAG addresses this by coupling an LLM with a knowledge base: the model retrieves pertinent documents (here from Europe PMC) and conditions its output on this evidence [9]. Studies in healthcare have shown that RAG-augmented systems yield more accurate, up-to-date answers than base LLMs alone. For instance, Masanneck et al. [5,8] tested multiple LLMs on neurology guidelines and found that a fixed-document RAG setup markedly improved accuracy over unsupported models, though caution remains for hallucinations and case-based scenarios. Similarly, Kuo et al. [5] describe a hierarchical RAG pipeline that retrieves heterogeneous clinical trial data and generates reports with higher factual consistency and greatly reduced authoring time compared to manual methods [8,10]. Inspired by such successes, we propose applying RAG to qEEG case reporting.

EEG Data Preprocessing and Spectral Feature Extraction

Raw EEG data (e.g. BrainVision .eeg/.vhdr files) are ingested and organized according to the Brain Imaging Data Structure (BIDS) for neurophysiology. We use MNE-Python [1] to apply a standard preprocessing chain: band-pass filtering (e.g. 1–50 Hz), removal of line noise, and artifact correction (automatic identification of bad channels, independent component analysis for ocular/muscle artifacts). It is critical to document each step for reproducibility: we leverage the MNE-BIDS-Pipeline framework [2], which provides scripted execution of preprocessing steps with caching and provenance tracking. This ensures that the exact filtering, referencing (common average or linked mastoids [11]), and artifact-rejection parameters are recorded for audit and reuse. Recent work [12] emphasizes that even such preprocessing choices can dramatically affect downstream analyses, so consistency is vital in a clinical research context.
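The article's actual MNE-BIDS-Pipeline configuration is not reproduced here; as a minimal stand-in, the same chain – band-pass filtering, line-noise removal, and common-average referencing – can be sketched with SciPy alone. The function name and parameter defaults are illustrative, not the pipeline's API.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt

def preprocess(eeg, sfreq, l_freq=1.0, h_freq=50.0, notch=50.0):
    """Band-pass filter, notch out line noise, re-reference to common average.

    eeg: array of shape (n_channels, n_samples); sfreq in Hz.
    """
    # Zero-phase band-pass (1-50 Hz, as in the article's example)
    sos = butter(4, [l_freq, h_freq], btype="bandpass", fs=sfreq, output="sos")
    out = sosfiltfilt(sos, eeg, axis=-1)
    # Narrow notch at the line-noise frequency
    b, a = iirnotch(notch, Q=30.0, fs=sfreq)
    out = filtfilt(b, a, out, axis=-1)
    # Common-average reference: subtract the instantaneous mean across channels
    return out - out.mean(axis=0, keepdims=True)
```

In the real pipeline these parameters live in a versioned MNE-BIDS-Pipeline config file rather than function defaults, which is what makes them auditable.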

From the cleaned continuous EEG, we compute spectral power in canonical bands (delta 1–4 Hz, theta 4–8 Hz, alpha 8–13 Hz, beta 13–30 Hz, etc.) using Welch’s method or multitapering. Relative band power and ratios (e.g. theta/beta) are computed per channel and averaged over regions of interest. These qEEG features are saved in a structured format (CSV or JSON) along with metadata such as channel montages and patient demographics. This numeric summary forms the quantitative core of the report (for example, “Global theta power was elevated to 150% of the normative mean, consistent with prior findings in schizophrenia”). We also compute EEG complexity or connectivity metrics (entropy, coherence) as advanced optional features. Importantly, the feature extraction is coded in Python using open-source libraries (MNE, SciPy), and the entire pipeline from raw data to feature table can be re-run end-to-end, fulfilling reproducibility standards in bioengineering.
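The band-power step above can be sketched as follows. The band edges match those in the text; the function name, channel-averaging choice, and output format are assumptions, since the article does not publish its feature-extraction code.

```python
import numpy as np
from scipy.signal import welch

# Canonical qEEG bands from the text (Hz)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(eeg, sfreq, bands=BANDS):
    """Relative power per band (averaged over channels) plus theta/beta ratio.

    eeg: array of shape (n_channels, n_samples).
    """
    # Welch PSD with 4-second segments, then average across channels
    freqs, psd = welch(eeg, fs=sfreq, nperseg=int(4 * sfreq), axis=-1)
    psd = psd.mean(axis=0)
    total = psd[(freqs >= 1) & (freqs <= 30)].sum()
    feats = {}
    for name, (lo, hi) in bands.items():
        feats[name] = psd[(freqs >= lo) & (freqs <= hi)].sum() / total
    feats["theta_beta_ratio"] = feats["theta"] / feats["beta"]
    return feats
```

The returned dict can be serialized directly to the CSV/JSON feature table mentioned above.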

Clinical Data Integration

In addition to EEG, we integrate patient-specific clinical information to contextualize findings. For example, we include diagnosis, medication status, age, and standardized scores such as the BPRS (for psychiatric symptoms), HAM-D (for depression), or MoCA (for cognition). These data may come from an electronic health record or study database. In our framework, clinical scores are merged with EEG results so that the LLM can mention them (e.g. “The patient’s BPRS score was 28, indicating moderate schizophrenia symptoms”). Prior studies often correlate EEG power changes with symptom scales. As one example, Newson & Thiagarajan [3] note that schizophrenia severity was assessed using the PANSS and BPRS in most EEG studies. By including such measures, the generated report can explain how EEG abnormalities align (or do not align) with clinical severity. This multimodal integration also allows RAG to retrieve literature linking EEG markers and clinical metrics. For instance, a search combining “theta power schizophrenia BPRS” may yield studies discussing EEG predictors of symptom improvement, which the narrative can cite.
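The merge step can be sketched with pandas, assuming a BIDS-style `participant_id` key; the clinical column names (`bprs_total`, `diagnosis`) are illustrative, since the article does not specify its schema.

```python
import pandas as pd

# qEEG feature table produced by the spectral step (values illustrative)
features = pd.DataFrame({
    "participant_id": ["sub-01", "sub-02"],
    "theta_rel": [0.31, 0.22],
    "theta_beta_ratio": [2.4, 1.6],
})

# Clinical scores, e.g. exported from a study database (column names assumed)
clinical = pd.DataFrame({
    "participant_id": ["sub-01", "sub-02"],
    "diagnosis": ["schizophrenia", "control"],
    "bprs_total": [28, 3],
})

# Inner merge on the BIDS participant label: one row per subject combining
# EEG features with the clinical context the LLM will cite
merged = features.merge(clinical, on="participant_id", how="inner")
```

An inner merge silently drops subjects missing from either table; an `outer` merge with a completeness check may be preferable in cohort studies.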

Literature Retrieval and RAG-based Report Generation

To produce an evidence-based narrative, the system queries Europe PMC for relevant literature. Europe PMC is an open-access life sciences repository containing ~36 million article abstracts and 5 million full texts. We construct search queries using patient context (e.g. “schizophrenia EEG theta”), methodological terms (e.g. “qEEG spectrum analysis software”), and any novel findings (e.g. “delta power increase clinical meaning”). Using the Europe PMC RESTful API (or associated Python libraries), the pipeline retrieves top-ranking abstracts and open-access full-texts matching these queries. The selection is filtered for recency and relevance (for example, the last 10 years, human studies, English language).
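Query construction can be sketched against the public Europe PMC REST search endpoint; the `OPEN_ACCESS` and `PUB_YEAR` filters follow the documented query syntax, while the helper name and defaults are assumptions.

```python
from urllib.parse import urlencode

# Public Europe PMC REST search endpoint
EPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def build_epmc_query(terms, first_year=2015, last_year=2025, page_size=10):
    """Compose a search URL combining patient-context terms with the
    recency and open-access filters described in the text."""
    query = " AND ".join(terms)
    query += f" AND OPEN_ACCESS:y AND PUB_YEAR:[{first_year} TO {last_year}]"
    params = {"query": query, "format": "json",
              "pageSize": page_size, "resultType": "core"}
    return EPMC_SEARCH + "?" + urlencode(params)

# The URL can then be fetched, e.g. requests.get(url, timeout=30).json()
url = build_epmc_query(["schizophrenia", "EEG", "theta"])
```

Keeping query construction in a pure function also makes the retrieval step loggable for the provenance records discussed later.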

The core of report generation uses a Retrieval-Augmented Generation model. Retrieved documents (titles, snippets, or passages) form an evidence bank. We then prompt a large language model (e.g. GPT-4 or a fine-tuned domain model) with both the structured patient/EEG data and key excerpts from the literature. The prompt instructs the LLM to write a structured report, ensuring each statement is grounded in the retrieved evidence. For example, in the LLM prompt we include: “Patient is a 30-year-old with schizophrenia (BPRS 30) whose EEG shows elevated theta power (mean 8 µV²). According to [Smith et al. 2022], increased theta power correlates with positive symptoms. Summarize these findings in a report with 9 references.” The RAG approach has demonstrated improved factual accuracy in medical summaries. Our application is analogous to systems such as AlzheimerRAG, which fuses textual and imaging data for case studies [13], and the EEG-MedRAG framework, which builds hypergraphs of EEG knowledge and patient data for causal diagnosis generation [6].
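The prompt-assembly step can be sketched as below. The article's actual prompt template is not published; the field labels and section names here simply mirror its running example, and the function name is illustrative.

```python
def build_report_prompt(patient, features, evidence):
    """Assemble a grounded prompt: structured patient/EEG data plus a
    numbered evidence bank of Europe PMC excerpts.

    evidence: list of (citation, snippet) pairs.
    """
    lines = ["You are drafting a clinical qEEG case report.", "", "Patient context:"]
    lines += [f"- {k}: {v}" for k, v in patient.items()]
    lines += ["", "qEEG findings:"]
    lines += [f"- {k}: {v}" for k, v in features.items()]
    lines += ["", "Evidence (cite by number; do not invent sources):"]
    lines += [f"[{i}] {cite}: {snip}" for i, (cite, snip) in enumerate(evidence, 1)]
    lines += ["", "Write sections: Background, EEG Acquisition, Results, Discussion.",
              "Ground every claim in the numbered evidence."]
    return "\n".join(lines)
```

Numbering the evidence items lets post-hoc checks map every citation in the generated text back to a retrieved document.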

The output is a draft report comprising sections: Background (patient demographics, clinical history), EEG Acquisition (recording details, preprocessing), Results (qEEG features with normative comparisons), and Discussion (interpretation citing literature). In the Discussion, the LLM references specific studies (e.g., “The observed theta increase aligns with reports of frontal slowing in schizophrenia [3]”) and notes if findings contradict the literature (e.g. no alpha slowing despite expectation). Each cited fact is traced to a reference from Europe PMC to maintain transparency. The language model is steered to a “report-writing” style, using sentence templates extracted from sample case studies.
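The fixed section ordering can be enforced by a small assembler around the generated texts; Markdown output and the function name are assumptions here.

```python
def assemble_report(sections):
    """Render generated section texts in the fixed report order as Markdown.

    sections: dict mapping section title -> generated text; sections absent
    from the dict are simply skipped.
    """
    order = ["Background", "EEG Acquisition", "Results", "Discussion"]
    return "\n\n".join(f"## {title}\n\n{sections[title]}"
                       for title in order if title in sections)
```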

Reproducibility and Open Pipeline Implementation

A key design goal is full reproducibility. All code is in Python and managed with version control. The data flow is modular: BIDS validation ensures input conformity, the MNE-BIDS-Pipeline [1,11] handles preprocessing with cacheable steps, and feature computation scripts log their parameters. We containerize the environment (e.g. with Docker or Conda) so others can recreate the exact software setup. The RAG component is also documented: the retrieval queries and LLM prompts are saved alongside the results. This allows independent verification of the report content.
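Saving the retrieval queries and prompts alongside the results can be as simple as writing a JSON provenance record per report; the file name and record fields below are illustrative, not the pipeline's actual layout.

```python
import json
import time
from pathlib import Path

def log_rag_provenance(out_dir, queries, prompt, model_name):
    """Write the retrieval queries and exact LLM prompt next to the results
    so a generated report can later be audited and regenerated."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "model": model_name,
        "queries": queries,
        "prompt": prompt,
    }
    path = out / "rag_provenance.json"
    path.write_text(json.dumps(record, indent=2))
    return path
```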

To facilitate reuse by researchers, we leverage community standards. Using the BIDS format means that any EEG dataset following BIDS-EEG [1] can be plugged into the pipeline. We provide example Jupyter notebooks that walk through each step on sample data. The pipeline can run in parallel for multiple subjects, supporting large-scale studies (as advertised by the MNE-BIDS-Pipeline for hundreds of datasets [11]). Summaries of processing (filter logs, artifact rejection rates, feature distributions) are automatically compiled into a report PDF, enabling quick quality checks. By being open-source, this framework advances the ethos of reproducible neuroengineering practice.

Clinical Research Utility, Education, and Future Directions

This automated reporting tool has several utilities. In clinical research, it accelerates the generation of case studies and cohort summaries. Investigators can use it to standardize EEG report content across studies, reducing variability. The inclusion of RAG ensures that reports cite current literature, keeping interpretations up to date, which is crucial in fields where biomarker validity is evolving. In neuroengineering education, the pipeline serves as a teaching aid: students can explore how preprocessing choices affect features [12], and how AI can assist in interpreting neurophysiological data.

The system can generate example cases for training, highlighting how spectral changes relate to diagnosis and literature. Looking forward, the framework can be extended. Future versions might incorporate other modalities (e.g. MRI or genetics) into the RAG context, enabling truly multimodal case reports. Improving the LLM’s domain specificity (through fine-tuning on neuroengineering literature) could reduce errors. There is also potential for real-time use: integrating with EEG acquisition software to update reports as data are collected. From a bioengineering perspective, such tools illustrate how AI can bridge raw data and clinical insight, embodying the translational promise of neuroinformatics.

Strengths, Limitations, and Outlook

This approach offers major strengths: automation greatly reduces expert time, promotes consistency, and ties findings to evidence. By using open pipelines and data standards, it encourages reproducible research [5,8] and democratizes complex EEG analysis for non-experts. However, limitations remain. EEG preprocessing is sensitive; suboptimal filtering or artifact correction can mislead analysis [12-15]. The quality of the LLM report hinges on retrieval: if relevant literature is missed or irrelevant documents are retrieved, the narrative may be skewed or incomplete. As noted in prior RAG studies, LLMs can still hallucinate or oversimplify in clinical contexts, so reports must be reviewed by experts. Data privacy is also a concern: patient data used in prompts should be de-identified and handled under appropriate governance.

In future work, evaluation is critical: we plan systematic testing of report accuracy by comparing AI generated reports with those by neurophysiologists. Advances in domain-specific LLMs and larger EEG text corpora will likely improve performance. Overall, merging automated EEG analytics with retrieval augmented AI represents a promising direction in bioengineering — one that could transform how we synthesize physiological data, clinical scores, and biomedical knowledge into actionable insights.

References

  1. Newson JJ, Thiagarajan TC (2019) EEG frequency bands in psychiatric disorders: a review of resting state studies. Front Hum Neurosci 12: 521. [crossref]
  2. Kessler R, Enge A, Skeide MA (2025) How EEG preprocessing shapes decoding performance. Commun Biol 8: 1039.
  3. Wang Y, Luo H, Meng L (2025) EEG-MedRAG: Enhancing EEG-based clinical decision-making via hierarchical hypergraph retrieval-augmented generation. arXiv: 2508.13735.
  4. Masanneck L, Meuth SG, Pawlitzki M (2025) Evaluating base and retrieval-augmented LLMs with document or online support for evidence-based neurology. NPJ Digit Med 8: 137.
  5. Kuo SM, Tai SK, Lin HY, Chen RC (2025) Automated clinical trial data analysis and report generation by integrating Retrieval-Augmented Generation (RAG) and LLM technologies. AI 6: 188.
  6. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, et al. (2013) MEG and EEG data analysis with MNE-Python. Front Neurosci 7: 267. [crossref]
  7. EMBL-EBI Literature Services (2025) Europe PMC database and text mining infrastructure; MNE-BIDS-Pipeline: https://mne.tools/mne-bids-pipeline/stable/
  8. How EEG preprocessing shapes decoding performance. Communications Biology.
  9. https://www.nature.com/articles/s42003-025-08464-3
  10. EEG frequency bands in psychiatric disorders: a review of resting state studies. https://pmc.ncbi.nlm.nih.gov/articles/PMC6333694/
  11. EMBL-EBI Literature Services – Europe PMC database and text mining infrastructure. https://www.ebi.ac.uk/about/teams/literature-services/
  12. Evaluating base and retrieval-augmented LLMs with document or online support for evidence-based neurology. npj Digital Medicine.
  13. https://www.nature.com/articles/s41746-025-01536-y
  14. EEG-MedRAG: Enhancing EEG-based clinical decision-making via hierarchical hypergraph retrieval-augmented generation. https://arxiv.org/abs/2508.13735
  15. Automated clinical trial data analysis and report generation by integrating Retrieval-Augmented Generation (RAG) and Large Language Model (LLM) technologies. https://www.mdpi.com/2673-2688/6/8/188; AlzheimerRAG: Multimodal retrieval-augmented generation – arXiv. https://arxiv.org/html/2412.16701v1

Article Type

Case Report

Publication history

Received: September 04, 2025
Accepted: September 10, 2025
Published: September 12, 2025

Citation

Netanel Stern (2025) Automated qEEG Case Study Generation with Retrieval-Augmented AI and Clinical Data Integration. J Clin Res Med Volume 8(3): 1–7. DOI: 10.31038/JCRM.2025831

Corresponding author

Netanel Stern
XSCHIZQ Laboratory
Israel