Harnessing AI for Valuable Real-World Insights in Multiple Sclerosis: Insights from Rebekah Foster, MBA, and John Foley, MD, FAAN

At ECTRIMS 2025, clinicians and data scientists presented concrete examples of how AI insights can transform fragmented electronic health records into robust, research-grade registries for multiple sclerosis (MS). New methods automate abstraction from unstructured notes and MRI reports, enabling scalable measurement of relapses, disability progression, and treatment patterns. The work described by Rebekah Foster and John Foley illustrates practical deployments, vendor partnerships, and technical architectures that move AI insights from pilot projects to clinical and regulatory use. Short, evidence-driven summaries below explore the model architectures, validation strategies, vendor ecosystems, and implications for clinical care and industry stakeholders.

AI insights for EHR abstraction in multiple sclerosis: CHARM model and real-world registry building

The adoption of automated curation tools has created new possibilities for deriving structured data from historically unstructured EHR content. AI insights here refer to the specific actionable elements extracted: diagnosis confirmation, disease subtype, relapse events, EDSS estimation, MRI lesion counts, and therapy timelines. The CHARM (Century Health Abstraction and Retrieval Model) effort presented at ECTRIMS demonstrates feasibility at scale, extracting an estimated EDSS for nearly all visits in a de-identified Utah EHR cohort spanning 2011–2025.

Problem statement: fragmented EHRs and missing variables

Clinical records for MS are often stored across multiple systems with a heavy reliance on narrative notes and radiology reports. Traditional registries that depend on structured fields or manual abstraction miss critical clinical nuance and limit cohort size. AI insights enable automated recovery of that nuance, improving sample size and representativeness.

  • Common missing elements in manual registries: lesion descriptions, MRI sequences, relapse timing.
  • Opportunity: recovering discontinuation reasons and therapy sequences from clinician notes.
  • Impact: expanding cohorts and shortening time-to-analysis for safety and efficacy studies.

The CHARM team optimized an LLM-based pipeline to map natural-language phrasing to discrete variables. The model’s abstraction scope included MS diagnosis, disease subtype, therapy details and discontinuation reasons, estimated EDSS, relapses, and MRI outcomes. Performance highlights from the study included EDSS estimation for 97.9% of visits and a median derived EDSS of 3.0; 79.5% of patients had at least one relapse documented in the dataset. These figures illustrate how AI insights make otherwise inaccessible metrics available for downstream analyses.
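
As a toy illustration of the abstraction task, the sketch below pulls a numeric EDSS value out of free-text note phrasing. This is not the CHARM approach, which the study describes as LLM-based; the regex pattern and bounds check here are assumptions made purely for illustration.

```python
import re
from typing import Optional

# Illustrative only: CHARM uses an LLM pipeline; this regex sketch merely
# shows the kind of note-to-variable mapping involved in EDSS abstraction.
EDSS_PATTERN = re.compile(r"\bEDSS\b[^0-9]{0,15}(\d(?:\.\d)?)", re.IGNORECASE)

def extract_edss(note_text: str) -> Optional[float]:
    """Return the first EDSS value mentioned in a clinical note, if any."""
    match = EDSS_PATTERN.search(note_text)
    if match:
        value = float(match.group(1))
        if 0.0 <= value <= 10.0:  # EDSS is defined on a 0-10 scale
            return value
    return None

print(extract_edss("Exam stable; EDSS 3.0, gait intact."))  # 3.0
print(extract_edss("No disability score recorded today."))  # None
```

A production system would of course need to handle negation, historical mentions, and phrasing variants, which is precisely why the study relies on a trained language model rather than patterns like this.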

Technical elements and workflows

Successful EHR abstraction pipelines combine several technical layers: ingestion and normalization, text parsing with named entity recognition, temporal alignment of events, and rule-based adjudication for ambiguous cases. Integrations with vendor APIs and cloud providers accelerate deployment, but governance is crucial to maintain reproducibility.

  • Data ingestion: secure de-identification and normalization across hospital systems.
  • Language models: fine-tuned LLMs and ensemble classifiers for entity extraction.
  • Adjudication rules: heuristics to reconcile conflicting statements across notes and imaging reports.
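
The layered workflow above can be sketched schematically. All names below are hypothetical, and the keyword-match extraction step is a stand-in for the fine-tuned LLM/NER components described in the text; only the deduplication heuristic in the adjudication layer is meant to be taken literally.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Event:
    patient_id: str
    kind: str        # e.g. "relapse", "edss", "therapy_start"
    value: str
    when: date
    source: str      # originating note or report ID

def normalize(raw_records: list[dict]) -> list[dict]:
    """Ingestion layer: de-identified records mapped to a common schema."""
    return [{"patient_id": r["pid"], "text": r["note"], "date": r["date"]}
            for r in raw_records]

def extract_entities(record: dict) -> list[Event]:
    """Entity layer: placeholder for an NER/LLM extraction call."""
    events = []
    if "relapse" in record["text"].lower():
        events.append(Event(record["patient_id"], "relapse", "documented",
                            record["date"], "note"))
    return events

def adjudicate(events: list[Event]) -> list[Event]:
    """Adjudication layer: reconcile duplicate mentions of one event.
    Heuristic: keep a single event per (patient, kind, date)."""
    seen, kept = set(), []
    for e in sorted(events, key=lambda e: e.when):
        key = (e.patient_id, e.kind, e.when)
        if key not in seen:
            seen.add(key)
            kept.append(e)
    return kept

records = normalize([
    {"pid": "p1", "note": "Acute relapse with optic neuritis.", "date": date(2023, 4, 2)},
    {"pid": "p1", "note": "Follow-up: relapse resolving.", "date": date(2023, 4, 2)},
])
events = adjudicate([e for r in records for e in extract_entities(r)])
print(len(events))  # 1: duplicate same-day relapse mentions collapsed
```

The point of the sketch is the separation of concerns: each layer can be validated, versioned, and monitored independently, which is what makes governance and reproducibility tractable.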

From a deployment standpoint, partnerships with established healthcare technology vendors ease integration. Microsoft and Google Health cloud services often provide foundational compute and managed data services. Vendor-specific solutions such as IBM Watson Health and Siemens Healthineers contribute image-processing toolchains for MRI report normalization. Close collaboration with clinical groups like the Rocky Mountain MS Clinic and data firms such as Nira Medical streamlines clinical validation and pragmatic implementation.

Concrete example: a mid-sized neurology practice partnered with an analytics startup to run CHARM-style abstraction on five years of EHR data. Within weeks, the practice identified unexpected therapy-switch patterns and two clusters of patients with early disability progression previously unreported in their manual registry. The practice used those AI insights to prioritize specific patients for care management interventions.

| Variable Extracted | CHARM Performance | Clinical Use Case |
| --- | --- | --- |
| Estimated EDSS | Derived for 97.9% of visits | Longitudinal disability progression studies |
| Relapse events | Documented for 79.5% of patients | Effectiveness and safety of DMTs |
| MRI lesion descriptors | Automated lesion presence & location | Imaging endpoints in real-world studies |

Key takeaways for implementers include robust de-identification, iterative clinician-in-the-loop validation, and continuous monitoring for drift. These AI insights become truly valuable when integrated with downstream analytics for drug safety, comparative effectiveness, and operational improvement. Final insight: automated EHR curation unlocks data at scale, but governance and clinician oversight remain the decisive factors for trustworthy deployment.

AI insights for generating synthetic outcomes and disability measures in MS

One of the most consequential advances reported at ECTRIMS was the ability to produce synthetic clinical measures—most notably synthetic EDSS scores—derived from unstructured notes. These AI insights enable researchers to quantify disability trajectories without the need for standardized trial scales recorded at every visit. Synthetic outcomes expand analytical reach beyond tightly controlled trials into heterogeneous real-world populations.

Why synthetic measures matter

Synthetic EDSS scores and similarly derived endpoints address a major gap: observational care settings rarely record standardized outcome metrics at consistent intervals. AI insights help infer progression signals, facilitating comparative effectiveness research and pharmacovigilance in larger, more representative cohorts.

  • Synthetic EDSS supports lifetime trajectory modeling for disease progression.
  • Allows safety analyses across broader patient populations, including underserved groups.
  • Enables retrospective evaluation of new therapies’ real-world impact sooner than prospective registries.

Practical validation comes from concordance studies comparing synthetic EDSS to clinician-assigned scores in subsets where both are available. In the Utah cohort, derived EDSS matched clinical impressions at a high rate, enabling confidence in longitudinal trend analyses. Moreover, synthetic relapses identified by CHARM-type models enable quick safety signal detection for biologic therapies, such as B-cell treatments, which accounted for 84.1% of treated patients in the cohort.
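
A concordance study of this kind reduces to a simple agreement metric over paired scores. The ±1.0-point tolerance in the sketch below is an illustrative assumption, not the threshold used in the Utah analysis, and the sample pairs are invented.

```python
# Illustrative concordance check between synthetic and clinician-assigned
# EDSS scores on visits where both are available.

def concordance(pairs: list[tuple[float, float]], tolerance: float = 1.0) -> float:
    """Fraction of visits where |synthetic - clinician| <= tolerance."""
    agree = sum(1 for synth, clin in pairs if abs(synth - clin) <= tolerance)
    return agree / len(pairs)

# (synthetic EDSS, clinician EDSS) — invented example data
pairs = [(3.0, 3.0), (4.5, 3.5), (2.0, 4.0), (6.0, 6.5)]
print(concordance(pairs))  # 0.75
```

In practice the metric would be reported alongside a correlation coefficient and stratified by site and disease subtype, so that systematic disagreement does not hide inside an aggregate agreement rate.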

Regulatory and methodological considerations

Regulators increasingly accept real-world evidence when methods are transparent and validated. For synthetic outcomes to be acceptable for regulatory or payer use, the generation process must be auditable, reproducible, and anchored to clinical truth where possible. This requires robust documentation of model training data, performance metrics, and clinician adjudication results.

  • Document model lineage and training corpus.
  • Report sensitivity, specificity, and calibration for key endpoints.
  • Include clinician adjudication and blinded validation cohorts.
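
Reporting sensitivity and specificity for a binary endpoint (for example, relapse detected vs. not) against clinician adjudication can be as simple as the sketch below; the example labels are invented for illustration.

```python
# Minimal sketch: sensitivity and specificity of model predictions
# against blinded clinician adjudication for a binary endpoint.

def sensitivity_specificity(predicted: list[bool],
                            adjudicated: list[bool]) -> tuple[float, float]:
    tp = sum(p and a for p, a in zip(predicted, adjudicated))
    tn = sum((not p) and (not a) for p, a in zip(predicted, adjudicated))
    fp = sum(p and (not a) for p, a in zip(predicted, adjudicated))
    fn = sum((not p) and a for p, a in zip(predicted, adjudicated))
    return tp / (tp + fn), tn / (tn + fp)

pred = [True, True, False, False, True]    # model output (invented)
truth = [True, False, False, False, True]  # adjudicated labels (invented)
sens, spec = sensitivity_specificity(pred, truth)
print(round(sens, 2), round(spec, 2))  # 1.0 0.67
```

Calibration for continuous endpoints (such as synthetic EDSS) would be documented separately, typically as a reliability plot comparing predicted and adjudicated score bands.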

Vendors and partners including IQVIA and Tempus bring experience in regulatory-grade real-world evidence generation and can help operationalize analytic frameworks. Collaboration with pharmaceutical sponsors like Biogen and Roche often focuses on endpoint harmonization and post-market surveillance. Industry players should also consider cross-vendor validation using frameworks described in resources such as the AI observability and risk articles available for technical teams.

Example case: a pharmacovigilance team used synthetic relapse counts to detect an uptick in adverse events following a formulary change. Collaboration with site clinicians confirmed the signal, prompting a targeted chart review and a rapid safety communication. The speed and scale of that detection illustrate how AI insights provide early-warning capabilities that were previously unattainable.

Essential insight: synthetic measures must be treated as validated instruments rather than black-box outputs. Proper governance and transparent validation turn AI insights into credible evidence for clinicians, regulators, and sponsors.

AI insights for industry stakeholders: Pharma, health systems and technology vendors

AI insights reshape the priorities of several stakeholder groups in MS research and care. Pharmaceutical companies gain improved safety and effectiveness evidence. Health systems obtain population-level disease management tools. Technology vendors offer integrated pipelines to operationalize EHR abstraction. Coordinating these stakeholders requires clear value propositions and interoperability standards.


Stakeholder-specific applications

Each stakeholder uses AI insights differently, and successful deployment rests on aligning incentives and technical capabilities. For example, payers may emphasize cost-of-care and functional outcomes, while sponsors focus on endpoint integrity and cohort comparability.

  • Pharma (Biogen, Roche): post-market evidence, label expansions, and trial enrichment.
  • Health systems and clinics: population health management and quality improvement initiatives.
  • Tech vendors (Microsoft, IBM Watson Health, Siemens Healthineers, Google Health): infrastructure, imaging analytics, and model governance.

Integrations across these groups often involve cloud providers and specialized health AI platforms. Microsoft’s healthcare initiatives and Google Health’s imaging tools provide foundational capabilities. Philips Healthcare and GE Healthcare contribute imaging device interoperability and DICOM standard compliance. Meanwhile, IQVIA and Tempus offer large-scale data harmonization and analytic services, enabling sponsors and health systems to scale insights across regions.

| Stakeholder | Primary Use of AI insights | Representative Vendors/Partners |
| --- | --- | --- |
| Pharma | Real-world endpoint validation, safety surveillance | Biogen, Roche, IQVIA |
| Health systems & clinics | Population management, clinical decision support | Rocky Mountain MS Clinic, Nira Medical |
| Technology vendors | Cloud infrastructure, imaging analytics, model deployment | Microsoft, Google Health, IBM Watson Health, Siemens Healthineers, Philips Healthcare, GE Healthcare |

Collaborative pilots often demonstrate value quickly. For instance, a pilot partnership between a neurology clinic and a cloud provider used AI insights to identify high-risk patients for care management, reducing unplanned hospitalization days. The project leveraged existing vendor APIs, and insights were integrated into clinician workflows via a decision-support dashboard.

  • Best practice: co-develop performance metrics and KPIs with all stakeholder groups.
  • Risk mitigation: address data sovereignty, vendor lock-in, and reproducibility with contractual and technical safeguards.
  • Scale path: start with local pilots, validate with clinical endpoints, then expand across systems.

Implementation lessons include prioritizing interoperability and keeping clinicians engaged through frequent feedback loops. Companies must also address third-party AI risks and compliance frameworks; resources like articles on third-party AI risk and compliance in the AI era provide practical guidance for risk managers and engineers. Concluding insight: aligning incentives and technical requirements across stakeholders accelerates conversion of AI insights into measurable clinical and commercial outcomes.

AI insights technical validation, observability, and cybersecurity for clinical-grade deployment

Transitioning research prototypes into clinical-grade tools requires rigorous validation pipelines, observability, and cybersecurity controls. Model performance must be continuously monitored to detect drift, bias, and hallucinations. Observability frameworks help ensure that AI insights remain reproducible and explainable across changing clinical documentation practices.

Validation and observability practices

Validation begins with holdout datasets and external validation across sites. It extends into post-deployment monitoring where model outputs are compared against clinician adjudications and outcome measures. Observability includes logging inputs, model decisions, and drift metrics to enable root-cause analysis when discrepancies appear.

  • Pre-deployment: cross-site validation, bias assessment, and explainability checks.
  • Post-deployment: drift detection, periodic re-training, and clinician feedback loops.
  • Auditability: versioned models, dataset snapshots, and documented decision pathways.
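
Drift detection can be implemented as a distribution-comparison statistic over model outputs. The sketch below uses the Population Stability Index (PSI) over binned scores; the 0.2 alert threshold is a common rule of thumb, not a value from the presentations, and the score samples are invented.

```python
import math

def psi(expected: list[float], observed: list[float], bins: int = 5) -> float:
    """Population Stability Index between two score samples.
    Higher PSI indicates more drift between baseline and current outputs."""
    lo = min(expected + observed)
    hi = max(expected + observed)
    width = (hi - lo) / bins or 1.0  # fall back if all values are equal

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-4) for c in counts]

    e, o = proportions(expected), proportions(observed)
    return sum((oi - ei) * math.log(oi / ei) for ei, oi in zip(e, o))

baseline = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]  # invented scores
current = [0.5, 0.6, 0.6, 0.7, 0.8, 0.8, 0.9, 0.9]   # invented scores
print(psi(baseline, current) > 0.2)  # True: distribution shift flagged
```

In a deployed pipeline this check would run on a schedule against the logged input/output streams described above, with breaches routed to the clinician feedback loop for root-cause analysis.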

Cybersecurity considerations are equally vital. EHR-derived pipelines handle sensitive health data and must conform to best practices for encryption, access control, and incident response. Published resources on AI hallucinations and cybersecurity provide practical attack scenarios and mitigations for teams responsible for production systems. In addition, attention to third-party dependencies and vendor security posture is necessary; a dedicated review of supply-chain risk can prevent avoidable breaches.

| Area | Key Controls | Relevant Concerns |
| --- | --- | --- |
| Model validation | External validation cohorts, clinician adjudication | Overfitting, site-specific bias |
| Observability | Input/output logging, drift detection | Silent degradation, undocumented changes |
| Cybersecurity | Encryption, access control, incident playbooks | Data exfiltration, model tampering |

Experienced vendors can support these controls: cloud providers and enterprise players such as Microsoft and Google Health offer integrated observability and security tooling. Independent vendors focusing on AI observability and adversarial testing can supplement internal capabilities. Practical implementation also benefits from policy-level engagement, including alignment with standards and stakeholder education about model limits and uncertainty.

  • Operational recommendation: deploy models initially in advisory modes with mandatory clinician review.
  • Security recommendation: conduct periodic red teaming and third-party audits for vendor components.
  • Governance recommendation: maintain cross-disciplinary governance boards including clinicians, data scientists, and security leads.

Example: a hospital system implemented a CHARM-style abstraction pipeline behind a strict access-controlled environment. After a six-month observability period and a targeted security audit, the system moved to decision-support mode for therapy review meetings. The observability logs enabled fine-grained improvements to entity extraction performance, reducing false positives in relapse detection by a measurable margin.

Final insight: robust validation, observability, and security are not optional—they are the foundation that turns AI insights into trusted clinical instruments.

Our opinion

AI insights now provide a pragmatic route to converting fragmented EHRs into high-value real-world evidence for multiple sclerosis. The CHARM example and the work discussed by experts demonstrate that synthetic outcomes, automated abstraction, and scalable validation pipelines can deliver clinically relevant metrics at scale. The combination of cloud infrastructure, vendor capabilities, and clinician oversight is the practical recipe for impact.

Strategic recommendations for stakeholders

For sponsors and health systems aiming to leverage AI insights, the strategy is straightforward: prioritize pilot projects that include clinician adjudication, select interoperable vendor stacks, and commit to observability and security practices. Collaboration with established partners—whether cloud providers or healthcare-specialized vendors—accelerates time-to-value.

  • Start with well-scoped pilots that map to a clear clinical or regulatory question.
  • Engage clinicians from day one to define gold standards and adjudication rules.
  • Invest in observability, documentation, and security to maintain trust and compliance.

Practical resources and further reading should be consulted by technical leads. Relevant topics include AI costs and management strategies, third-party AI risks, and cybersecurity posture for AI systems. For teams seeking to broaden their technical competence, materials on AI observability architecture and practical AI testing help bridge the gap from prototype to production.

Finally, industry collaboration with established vendors—Microsoft, IBM Watson Health, Siemens Healthineers, Google Health, Philips Healthcare, GE Healthcare—and research-oriented partners such as IQVIA, Tempus, Biogen, and Roche will remain central to scaling AI insights. These alliances combine clinical perspective, regulatory know-how, and technological reach to make AI-driven registries both useful and compliant.

  • Immediate action: identify a 6–12 month pilot using automated abstraction for a defined MS outcome.
  • Mid-term action: validate synthetic outcomes against clinician-assigned metrics and publish methods for transparency.
  • Long-term action: integrate insights into care pathways and regulatory submissions as trustworthy, auditable evidence.

A closing reflection: stakeholders should weigh the speed and scale afforded by AI insights against the obligation to validate, secure, and govern those systems. When those tradeoffs are managed deliberately, AI insights become an essential accelerator for better care, more responsive research, and smarter post-market surveillance. Readers are encouraged to explore technical resources and cross-industry case studies to align their next steps with proven practices and vendor capabilities.

Further reading and technical resources: explore practical guides and case studies on AI in healthcare, the role of observability in production AI, and management strategies for AI costs and governance from trusted sources and technical communities: AI insights MS management, third-party AI risks, AI observability architecture, AI costs and management strategies, and AI hallucinations and cybersecurity threats.