
CathAI Assistant: An Integrated .NET and NVIDIA MONAI Platform for Coronary Angiography Analysis

Abstract

CathAI Assistant is a proof-of-concept system that combines modern web technologies, GPU-accelerated medical imaging pipelines, and large language models to support coronary angiography interpretation. The solution ingests de-identified DICOM angiograms, performs automated coronary segmentation and vessel highlighting using an NVIDIA- and MONAI-based pipeline, and exposes the results through an ASP.NET Core web application. Clinicians can browse cases, select specific images or cine frames, visualize vessel overlays, and read multi-paragraph narrative summaries generated from the structured analysis. This article describes what the system does technically, outlines the architecture and implementation strategy, and discusses the potential impact and future evolution of the platform toward real-time cath lab assistance and clinical decision support.

1. Background and Motivation

Coronary angiography remains a core diagnostic and interventional modality in cardiology, yet interpretation is highly operator-dependent and time-sensitive. Typical cath lab workflows involve rapid visual assessment of multiple projections, mental integration of anatomy and lesion severity, and manual documentation. There is a growing need for systems that can:

  • Structure angiographic information into machine-readable form.

  • Provide consistent vessel-level analysis across projections.

  • Generate clear, clinically oriented summaries for reporting and communication.

Recent advances in medical imaging AI (for example, MONAI-based segmentation models) and large language models make it possible to combine pixel-level analysis with natural language explanations. CathAI Assistant is designed as a bridge between those capabilities and real cath lab workflows, starting with a focused four-week proof of concept that is technically credible and clinically interpretable, but deliberately scoped as a non-regulated prototype.

2. System Overview

CathAI Assistant is built as an end-to-end pipeline that starts from de-identified DICOM studies and ends at a web-based user interface. The main components are:

  1. A de-identification and DICOM handling layer.

  2. A GPU-based imaging backend using NVIDIA hardware and MONAI.

  3. An ASP.NET Core web application for case management and visualization.

  4. A vision service abstraction that decouples the application from specific AI engines.

  5. A large-language-model (LLM) summarization layer that generates multi-paragraph narratives for each analyzed frame.

2.1 De-identification and DICOM handling

Coronary angiogram studies are exported from the hospital PACS into a non-production environment under controlled processes. Before any AI processing, a de-identification pipeline:

  • Removes or replaces protected health information in DICOM headers (patient identifiers, dates, institution and operator details, UIDs).

  • Optionally masks or crops any burned-in text present in the fluoroscopic pixel data.

The implementation uses Orthanc as a local DICOM server with de-identification rules, supplemented by Python scripts based on pydicom where finer control is required. The output is a set of de-identified DICOM studies that retain all imaging content and technical metadata but are safe to process in external environments. These studies are registered as “cases” in the CathAI application, with a folder structure that maps images and derived artifacts to individual cases.
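The header-scrubbing step can be sketched as a rule table mapping DICOM keywords to actions. The following is a minimal, dependency-free Python sketch that models the header as a plain dictionary; the keywords and replacement values shown are an illustrative subset, not the actual Orthanc rules used in the system:

```python
# Illustrative de-identification rules: each DICOM keyword maps to an action.
# "remove" drops the element; "replace" substitutes a fixed dummy value.
DEID_RULES = {
    "PatientName": ("replace", "ANONYMOUS"),
    "PatientID": ("replace", "CASE-0000"),
    "PatientBirthDate": ("remove", None),
    "StudyDate": ("remove", None),
    "InstitutionName": ("remove", None),
    "OperatorsName": ("remove", None),
}

def deidentify_header(header: dict) -> dict:
    """Apply the rule table to a header modeled as keyword -> value."""
    out = {}
    for keyword, value in header.items():
        action, replacement = DEID_RULES.get(keyword, ("keep", None))
        if action == "remove":
            continue
        out[keyword] = replacement if action == "replace" else value
    return out

header = {
    "PatientName": "Doe^Jane",
    "PatientID": "12345",
    "StudyDate": "20240101",
    "Modality": "XA",   # technical metadata is retained unchanged
    "Rows": 512,
}
print(deidentify_header(header))
```

In the real pipeline the same pattern is applied by Orthanc's de-identification configuration and, where finer control is needed, by pydicom scripts operating on actual DICOM elements rather than a dictionary.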

2.2 Imaging AI backend with NVIDIA and MONAI

The imaging backend runs on a dedicated NVIDIA GPU node. In the current proof of concept this is a single workstation-class system configured with Linux, CUDA, PyTorch, and MONAI. Orthanc provides DICOM ingest and retrieval, while a MONAI-based pipeline performs:

  • Preprocessing: frame extraction from cine loops, normalization of image intensities, orientation and view handling for selected projections.

  • Inference: coronary segmentation on 2D frames, producing vessel masks and basic highlighting of major structures such as LAD, LCX, RCA, and key branches.

  • Postprocessing: packaging the results as overlay images and structured JSON that describes the detected vessels, anatomical context, and qualitative regions of interest.
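The intensity-normalization part of the preprocessing step can be illustrated with a dependency-free sketch. A real pipeline would operate on MONAI tensors; here a frame is a nested list, and simple min-max scaling stands in for the actual transform:

```python
def normalize_frame(frame):
    """Min-max scale a 2D frame of raw intensities into [0.0, 1.0]."""
    flat = [v for row in frame for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant frame: avoid division by zero
        return [[0.0 for _ in row] for row in frame]
    return [[(v - lo) / (hi - lo) for v in row] for row in frame]

frame = [[0, 128], [256, 512]]
print(normalize_frame(frame))  # all values scaled into [0.0, 1.0]
```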

This backend is exposed through a lightweight REST API. Given a request that references a de-identified image, the API returns an object that includes the overlay (for visualization) and structured findings (for downstream summarization and storage).
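The packaging step behind that response might look roughly like the following sketch. All field names, the example view, and the overlay naming convention are illustrative assumptions, not the prototype's actual schema:

```python
import json

def build_analysis_response(image_ref: str, mask_found: bool) -> str:
    """Package inference results as the JSON object the REST API returns."""
    response = {
        "image": image_ref,
        "overlay_png": image_ref + ".overlay.png",  # rendered vessel overlay
        "findings": {
            "view": "LAO caudal",                   # illustrative projection
            "vessels": ["LAD", "LCX"] if mask_found else [],
            "regions_of_interest": [],
        },
        "status": "ok" if mask_found else "no_vessels_detected",
    }
    return json.dumps(response, indent=2)

print(build_analysis_response("case01/frame_042.png", mask_found=True))
```

Keeping the findings as structured JSON, separate from the rendered overlay, is what lets the summarization layer and the database consume the same result.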

2.3 ASP.NET Core web application

The clinician-facing interface is an ASP.NET Core application. It provides:

  • Authentication and authorization for controlled access.

  • A case list view that displays de-identified cath cases.

  • A per case gallery that shows images or representative frames from cine loops.

When a user clicks on an image, the application opens a modern modal dialog that:

  • Displays the original image or frame.

  • Allows toggling of the AI-generated vessel overlay.

  • Shows key structured findings and a multi-paragraph narrative summary describing what is visible in that image.

The application logs activity and can be extended with audit trails and role-based access controls as the solution matures.

2.4 Vision service abstraction

A key design element is the IVisionImageAnalysisService interface, which abstracts the imaging AI from the rest of the application. Implementations of this interface are responsible for:

  • Accepting an image path and optional contextual information.

  • Invoking the MONAI- and NVIDIA-based backend as required.

  • Returning a DetailedFrameSummaryResult that contains status flags, error messages if any, and textual summaries.

The current implementation, MonaiImageAnalysisService, calls the external MONAI inference endpoint, stores any overlay images, builds a structured textual summary of the MONAI result, and then drives the LLM-based narrative generation. This separation makes it possible to swap models or backends without destabilizing the ASP.NET application.
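The shape of this abstraction can be sketched as follows. The production interface lives in the ASP.NET application and is written in C#; this Python sketch mirrors the names IVisionImageAnalysisService, MonaiImageAnalysisService, and DetailedFrameSummaryResult, but the fields and method signature are illustrative assumptions:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass
class DetailedFrameSummaryResult:
    """Status flags, optional error details, and textual summaries."""
    success: bool
    summary_text: str = ""
    overlay_path: Optional[str] = None
    error_message: Optional[str] = None

class IVisionImageAnalysisService(ABC):
    """Decouples the web app from any particular imaging AI backend."""
    @abstractmethod
    def analyze(self, image_path: str, context: Optional[dict] = None) -> DetailedFrameSummaryResult:
        ...

class MonaiImageAnalysisService(IVisionImageAnalysisService):
    def analyze(self, image_path, context=None):
        # A real implementation would call the MONAI REST endpoint here,
        # store the returned overlay, and drive the LLM summarization.
        return DetailedFrameSummaryResult(
            success=True,
            summary_text=f"Structured summary for {image_path}",
            overlay_path=image_path + ".overlay.png",
        )

result = MonaiImageAnalysisService().analyze("case01/frame_001.png")
print(result.success, result.overlay_path)
```

Because callers only depend on the interface and the result type, a different backend (or a mock for testing) can be substituted without touching the UI code.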

2.5 Large language model summarization

The final step in the pipeline transforms structured imaging findings into a narrative form that is more useful to clinicians. The summarization component:

  • Receives a structured summary from the MONAI result, including view type, coronary system context, identified structures, and any additional notes.

  • Optionally uses the overlay image or the original frame as part of a multimodal prompt.

  • Calls a large language model to generate clear, clinician-oriented text under explicit constraints, such as a minimum number of paragraphs and avoidance of technical artifacts like file names or JSON details.

  • Normalizes paragraph spacing and returns the result as part of DetailedFrameSummaryResult.

When the LLM is unavailable, a deterministic fallback generates multi-paragraph text based purely on the structured summary. This ensures that the UI always has a meaningful explanation, even without external services.
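Such a deterministic fallback can be sketched as a template expansion over the structured summary. The paragraph wording and the minimum-paragraph constant below are illustrative assumptions, not the system's actual templates:

```python
MIN_PARAGRAPHS = 3  # illustrative constraint matching the LLM prompt

def fallback_summary(findings: dict) -> str:
    """Generate multi-paragraph text from structured findings alone."""
    view = findings.get("view", "an unspecified projection")
    vessels = findings.get("vessels", [])
    paragraphs = [
        f"This frame was acquired in {view}.",
        "The analysis highlighted the following structures: "
        + ", ".join(vessels) + "."
        if vessels else
        "No major coronary structures were confidently identified.",
        "These findings are automatically generated and should be "
        "reviewed by a qualified operator before clinical use.",
    ]
    while len(paragraphs) < MIN_PARAGRAPHS:
        paragraphs.append("No additional findings were recorded.")
    # Normalize spacing: exactly one blank line between paragraphs.
    return "\n\n".join(p.strip() for p in paragraphs)

text = fallback_summary({"view": "RAO cranial", "vessels": ["LAD", "D1"]})
print(text)
```

The same normalization step (joining trimmed paragraphs with single blank lines) is applied to LLM output as well, so the UI renders both paths identically.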

3. Current Proof of Concept Scope

The first implementation is deliberately scoped to a four-week proof of concept. Its objectives are to:

  • Demonstrate end-to-end viability: from de-identified DICOM ingest through GPU-based analysis to web-based visualization and summarization.

  • Provide a realistic user experience where a clinician can click on images, see overlays, and read narrative summaries.

  • Limit technical risk by focusing on a single GPU node, a constrained set of projections and views, and non-real-time batch processing.

The proof of concept does not attempt to solve every integration problem. Out of scope items include:

  • Real-time streaming directly from cath lab equipment.

  • Production PACS write-back using DICOM SEG or SR.

  • Integration with hospital EHR systems through HL7 or FHIR.

  • Comprehensive multi-vendor, multi-protocol model generalization.

  • Regulatory submissions or clinical decision support claims.

This scope allows rapid iteration and feedback while maintaining technical rigor in the core components.

4. Potential and Future Directions

Although the current system is positioned as a technical demonstrator, it has significant potential across several dimensions.

4.1 Workflow support in the cath lab

Once extended to real-time streaming and integrated with Holoscan, the same architecture can support:

  • Live overlays during diagnostic angiography, helping operators visualize vessel trees and complex branching patterns.

  • Frame selection assistance, highlighting the most informative frames for documentation and quantitative analysis.

  • Automatic capture of structured findings that seed the final procedure report.

The narrative summaries can evolve into preliminary report drafts that cardiologists review and edit, reducing documentation burden and improving consistency.

4.2 Quantitative analysis and longitudinal tracking

With reliable segmentation and consistent vessel labeling, the platform can be extended to:

  • Compute quantitative metrics such as approximate vessel diameters, lesion lengths, and relative narrowing across frames.

  • Track changes across follow-up procedures, such as disease progression or restenosis at treated sites.

  • Provide structured outputs that are suitable for registry submissions and research databases.

This moves beyond per-frame interpretation toward longitudinal disease tracking.
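One representative quantitative metric is percent diameter stenosis, which relates the minimal lumen diameter (MLD) within a lesion to a reference vessel diameter (RVD). A minimal sketch, assuming both diameters have already been estimated from the segmentation masks:

```python
def percent_diameter_stenosis(mld_mm: float, rvd_mm: float) -> float:
    """Percent diameter stenosis: %DS = (1 - MLD / RVD) * 100."""
    if rvd_mm <= 0:
        raise ValueError("reference vessel diameter must be positive")
    return round((1.0 - mld_mm / rvd_mm) * 100.0, 1)

# A lesion narrowing a 3.0 mm vessel to 1.2 mm is a 60% diameter stenosis.
print(percent_diameter_stenosis(1.2, 3.0))  # 60.0
```

In practice the harder problem is upstream: estimating MLD and RVD reliably from 2D projections, which is why such metrics are listed as an extension rather than part of the current proof of concept.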

4.3 Training data generation and research

CathAI Assistant also provides a framework for building better models:

  • Segmentation outputs, overlays, and narrative summaries create a rich record that can be reviewed, corrected, and used as training data.

  • The combination of structured data and free text makes it easier to explore weak-supervision and self-supervision strategies.

  • Researchers can query the database for specific patterns of anatomy, pathology, or devices, linked to both images and textual descriptions.

This supports iterative improvement of both the imaging models and the language based explanations.

4.4 Integration with enterprise systems

In later phases, the same architecture can be integrated with existing hospital infrastructure:

  • PACS write-back of DICOM SEG and SR objects, so that segmentations and structured findings are visible in standard viewers.

  • EHR integration through FHIR or HL7 interfaces, enabling automatic population of reports and procedure notes.

  • Role-based and audit-controlled access via the hospital identity system.

These steps move the system from a research prototype toward a clinically integrated platform.

5. Technical Challenges and Open Questions

Several technical and scientific questions remain and represent opportunities for further work.

  • Generalization across devices and protocols
    Coronary angiography varies by vendor, acquisition protocol, and operator style. Robust segmentation and vessel labeling across this diversity require careful model design and large, curated datasets.

  • Real-time constraints
    Transitioning from offline or near-real-time processing to continuous real-time overlays in the cath lab imposes strict latency and reliability requirements. This will drive decisions about model complexity, pipeline optimization, and hardware selection.

  • Explainability and error handling
    While narrative summaries are useful, they must be calibrated and transparent. The system needs mechanisms to convey uncertainty, avoid overstatement, and highlight when AI outputs should not be trusted without further review.

  • Evaluation and validation
    Quantitative metrics for both segmentation quality and narrative usefulness must be defined. This involves clinical reader studies, inter-observer agreement, and possibly prospective trials in shadow mode.

  • Regulatory and safety considerations
    Moving toward clinical use will require alignment with relevant standards and regulatory guidance, as well as integration into hospital risk management and quality frameworks.

6. Conclusion

CathAI Assistant demonstrates how a modern technical stack can be assembled to support coronary angiography analysis in a realistic, clinically relevant manner. By combining a de-identified DICOM pipeline, an NVIDIA- and MONAI-based imaging backend, an ASP.NET Core web application, and large-language-model summarization, the system provides a complete loop from raw images to interpretable overlays and narrative explanations.

The current proof of concept is intentionally scoped as a non-regulated, four-week implementation, but it establishes the architectural patterns and integration points needed for future evolution. With further work on generalization, real-time performance, enterprise integration, and validation, the same approach can serve as the foundation for a new class of cath lab tools that augment human expertise with reproducible, structured, and explainable AI support.