Generative AI in medical devices: Seizing opportunities, keeping risks under control
Why isn’t GenAI a sure-fire success in medical technology? GenAI and LLMs such as ChatGPT are entering MedTech fast. What looks convenient (assistants, automated documentation, dialog UIs) touches strict safety and regulatory requirements. This article outlines the key risks and how manufacturers can keep them under control.
1. Misbehavior and hallucinations
LLMs do not work deterministically: identical inputs can lead to different outputs, and the generated content can be linguistically convincing yet factually incorrect. This behavior is highly critical in a medical context, because incorrect answers can influence diagnostic or therapeutic decisions. At the same time, it conflicts with core regulatory requirements for predictability, controllability, and performance, as demanded by the MDR/IVDR, ISO 14971, and the EU AI Act.
2. LLMs as OTS or SOUP components
Many manufacturers rely on external, cloud-based LLMs whose functionality, training data, and update cycles they have little control over. This creates dependencies that are difficult to classify in traditional software lifecycle models from a regulatory perspective. An unexpected model update can change the behavior of the entire system without the manufacturer having any influence over it or being able to fully understand the changes. This makes an LLM the most complex form of SOUP—with significant implications for change control, risk management, and ultimately product compliance.
3. Data protection and information security
GenAI systems often process sensitive and context-specific inputs. Unclear data flows, the storage of prompts, or the use of this data for training purposes can directly conflict with the GDPR. In addition, there are regulatory requirements from the MDR and standards such as IEC 81001-5-1, which require systematic information security management in connection with software in medical technology. Manufacturers must understand and document exactly where data is processed, who has access to it, and how data is protected against misuse.
4. Challenges in verification and validation
Traditional software can be validated through clearly defined requirements, reproducible tests, and unambiguous acceptance criteria. With GenAI, this is only possible to a limited extent. LLMs change over time, respond differently to inputs, and cannot be tested completely deterministically without additional architectural measures. At the same time, new challenges arise during operation, such as model drift or unexpected usage patterns, which necessitate extended monitoring and continuous performance observation.
How these risks can be controlled
System delimitation and intended purpose
The most important measure is a precise definition of the intended purpose and a clear delimitation of the system boundaries. The decisive question is whether the GenAI function is medically relevant or merely supportive. This delimitation determines classification, regulatory obligations, and the requirements for risk management and validation.
Technical guardrails and architectural measures
The safe use of GenAI requires an architecture that limits the behavior of AI and makes it controllable. This includes, for example, the use of fixed knowledge bases, embedding in retrieval-augmented generation concepts, rule-based output filters, or mechanisms for evaluating response quality. Such measures transform the system from an open AI black box into a structured, traceable, and validatable functional module.
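One of the guardrails mentioned above, a rule-based output filter, can be sketched in a few lines. This is a minimal illustration, not a validated implementation: the blocked patterns, the length limit, and the function name `filter_output` are all hypothetical examples chosen for this sketch.

```python
import re

# Hypothetical rule-based output filter: checks a generated answer against
# simple, auditable rules before it is shown to the user. Patterns and
# limits below are illustrative assumptions, not a clinical rule set.

BLOCKED_PATTERNS = [
    r"\bguaranteed cure\b",              # overclaiming therapeutic effect
    r"\bstop taking\b.*\bmedication\b",  # unauthorized therapy changes
]

MAX_ANSWER_CHARS = 2000  # keep outputs within a validated envelope

def filter_output(answer: str) -> tuple[bool, str]:
    """Return (accepted, reason). Rejected answers never reach the user."""
    if len(answer) > MAX_ANSWER_CHARS:
        return False, "answer exceeds validated length limit"
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, answer, flags=re.IGNORECASE):
            return False, f"blocked pattern matched: {pattern}"
    return True, "ok"
```

Because the rules are explicit and deterministic, this layer can be verified with conventional software tests, which is exactly what turns the open black box into a validatable functional module.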
Human oversight as a central principle
GenAI should never act autonomously in a medical context. Instead, the suggestions generated must be reviewed and confirmed by qualified users. This principle not only addresses safety-related issues but also meets key requirements of the EU AI Act. Comprehensive logging of individual interactions also ensures regulatory traceability.
Structured SOUP and supplier management
Manufacturers need transparent agreements and strategies to control changes to external models. These include clearly defined update and rollback mechanisms, coordinated SLAs, and monitoring that detects model changes at an early stage. Since it is often not possible to directly influence the provider, reactive change management is becoming increasingly important in the context of post-market monitoring.
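One monitoring technique that can detect silent updates to an external model is replaying a fixed set of "canary" prompts with known-good reference answers. The sketch below uses a trivial keyword check as the agreement criterion; the prompts, keywords, and threshold are invented for illustration, and a production system would use validated similarity metrics.

```python
# Hypothetical model-change monitor: canary prompts are replayed on a
# schedule, and a drop in agreement with reference answers signals that the
# external model may have changed, triggering change control.

CANARIES = [
    ("What units does sensor S1 report?", "millimeters"),
    ("Is the device intended for home use?", "no"),
]

def canary_agreement(model_answers: list[str]) -> float:
    """Fraction of canary prompts whose expected keyword appears in the answer."""
    hits = sum(
        1 for (_, expected), answer in zip(CANARIES, model_answers)
        if expected in answer.lower()
    )
    return hits / len(CANARIES)

def model_changed(model_answers: list[str], threshold: float = 1.0) -> bool:
    """True if agreement falls below the threshold, i.e. behavior has shifted."""
    return canary_agreement(model_answers) < threshold
```

Such a monitor does not prevent a provider-side update, but it shortens the time between an unannounced model change and the manufacturer's reactive change-management response.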
Advanced testing strategies and market observation
Traditional software testing is not sufficient for GenAI. Beyond functional tests, robustness, bias risks, and behavior in borderline and error situations must also be evaluated. Continuous performance monitoring throughout the product lifecycle ensures that the system remains within its defined limits and delivers its intended performance in a stable manner.
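Because outputs are not deterministic, a test case cannot assert an exact answer; one common pattern is to sample the model several times and assert an invariant over every sample. The sketch below illustrates this with a stub standing in for the model client; the invariant ("the assistant must never state a dosage itself") and all names are hypothetical.

```python
# Hypothetical robustness check for non-deterministic outputs: each prompt is
# sampled n times, and the test passes only if every sample satisfies a
# safety invariant, rather than matching an exact expected string.

def check_invariant(answer: str) -> bool:
    """Example invariant: the assistant must never state a dosage itself."""
    return "mg" not in answer.lower()

def robust_pass(sample_fn, prompt: str, n_samples: int = 5) -> bool:
    """Call the model n times; pass only if every sample holds the invariant."""
    return all(check_invariant(sample_fn(prompt)) for _ in range(n_samples))

# Stub standing in for a real model client:
def safe_stub(prompt: str) -> str:
    return "Please consult the prescribing physician."
```

Invariant-based sampling like this complements, rather than replaces, the continuous in-field monitoring described above: the same invariants can be evaluated on live traffic as part of post-market surveillance.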
An AI Management System (AIMS)
Because GenAI has an impact on processes, roles, organization, and compliance, the introduction of AI should always be considered at the system level. An AIMS provides the basis for this. In combination with the QMS (ISO 13485) and an ISMS (ISO 27001), an integrated governance structure is created that consistently brings together AI risks, security requirements, and regulatory requirements.
Preparation for incidents and disaster recovery
Since AI-based systems can generate new types of malfunctions, a coherent emergency plan is essential. Manufacturers must be able to quickly return to a safe configuration or, in extreme cases, temporarily withdraw the product from the market. An extended recall and rollback capability is therefore an essential part of the overall system.
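The fallback-to-safe-configuration idea can be made concrete with a simple kill switch: when the GenAI component is flagged unhealthy, the system answers from a pre-validated, deterministic path instead of free generation. The flag, fallback text, and function signature below are illustrative assumptions.

```python
# Hypothetical kill switch: if the GenAI component is disabled (e.g. after a
# suspicious model update detected by monitoring), the system falls back to a
# pre-validated, deterministic safe response instead of free generation.

SAFE_FALLBACK = "The AI assistant is currently unavailable. Please consult the instructions for use."

def answer(prompt: str, generate_fn, genai_enabled: bool = True) -> str:
    """Route to the generative path only while the component is enabled."""
    if not genai_enabled:
        return SAFE_FALLBACK
    return generate_fn(prompt)
```

Keeping the safe path entirely free of AI dependencies is what makes a rapid rollback credible: it can be validated once, up front, and switched on at any time without redeploying the product.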
What ultimately matters to manufacturers
GenAI can significantly advance medical technology, but only if its use is understood not as a mere “feature” but as a strategically and regulatorily sensitive functional module. Manufacturers who take a holistic view of risks, architecture, regulation, and organization lay the foundation for safe, powerful, and sustainable AI-supported solutions. By the time GenAI is deployed, at the latest, the company should establish a binding AI policy that clearly describes usage scenarios, limitations, roles, validation concepts, labeling, and training requirements.
How SEQLY supports you on this path
SEQLY MedTech helps manufacturers design safe, regulatory-compliant GenAI use cases, define system boundaries, establish AI risk management, and consolidate the requirements of the MDR, IVDR, AI Act, and relevant standards. M&M Software complements this with in-depth engineering expertise in AI architectures, human oversight mechanisms, IEC 62304-compliant processes, and integrated cybersecurity. This creates a seamless connection between strategy, regulatory requirements, and robust technical implementation.