- Together with its regulatory counterparts in Canada and the U.K., FDA has published 10 guiding principles for Good Machine Learning Practice (GMLP) in medical device development.
- The principles, which FDA created with its peers at Health Canada and the U.K. Medicines and Healthcare products Regulatory Agency, are intended to promote the development of safe and effective medical devices that use artificial intelligence and machine learning. The document is one of the deliverables laid out in FDA's AI/ML software as a medical device (SaMD) action plan, which the agency issued in January as it looks to establish a regulatory approach to the fast-developing field.
- FDA framed the principles as a starting point for international harmonization and is seeking feedback as part of its broader discussion of the regulatory framework for modifications to AI/ML-based SaMD.
This year may go down as the point at which regulators started trying to get a handle on the use of AI and ML in medical devices. Over the past 10 months, FDA has issued an AI/ML action plan for regulating the technology in medical devices, the European Commission has released contentious plans for the entire AI field and the U.K. has proposed an overhaul of how it regulates AI as a medical device.
Now, the U.S. and U.K. have begun collaborating on a global initiative. Together with their peers at Health Canada, officials at FDA and the U.K.'s MHRA have laid out the following guiding principles:
- Multi-Disciplinary Expertise Is Leveraged Throughout the Total Product Life Cycle
- Good Software Engineering and Security Practices Are Implemented
- Clinical Study Participants and Data Sets Are Representative of the Intended Patient Population
- Training Data Sets Are Independent of Test Sets
- Selected Reference Datasets Are Based Upon Best Available Methods
- Model Design Is Tailored to the Available Data and Reflects the Intended Use of the Device
- Focus Is Placed on the Performance of the Human-AI Team
- Testing Demonstrates Device Performance during Clinically Relevant Conditions
- Users Are Provided Clear, Essential Information
- Deployed Models Are Monitored for Performance and Re-training Risks Are Managed
Collectively, the principles cover concerns about the possible biases of algorithms, their applicability to clinical practice and the potential for them to evolve as they are used in the real world. FDA and its collaborators have expanded on each of the principles, explaining, for example, that developers need to have "appropriate controls in place to manage risks of overfitting, unintended bias or degradation of the model" when their systems are "periodically or continually trained after deployment."
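To make the monitoring principle concrete, the following Python sketch shows one simple way a developer might watch a deployed model for degradation. It is purely illustrative and is not drawn from the agencies' document: the `PerformanceMonitor` class, its `tolerance` and `window` parameters, and the use of accuracy as the metric are all assumptions made for this example.

```python
from collections import deque


class PerformanceMonitor:
    """Hypothetical rolling-accuracy check for a deployed model.

    The baseline, tolerance and window size are illustrative
    assumptions, not values taken from any regulatory guidance.
    """

    def __init__(self, baseline_accuracy, tolerance=0.05, window=100):
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # Keep only the most recent `window` outcomes.
        self.outcomes = deque(maxlen=window)

    def record(self, prediction, ground_truth):
        """Store whether the latest prediction matched the confirmed label."""
        self.outcomes.append(prediction == ground_truth)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self):
        """Flag degradation only once the window is full, to avoid noisy alerts."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance
```

In this sketch, a team would call `record` each time a confirmed outcome becomes available and review the model whenever `degraded()` returns `True`. A real system would likely track clinically relevant metrics such as sensitivity and specificity rather than accuracy alone.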
The principles represent a starting point for further work, rather than the conclusion of the agencies' thinking about AI and ML. FDA and its partners said the principles "are intended to lay the foundation for developing Good Machine Learning Practice" and identify areas where bodies such as the International Medical Device Regulators Forum could work to advance and harmonize the field.
Equipped with the 10 principles, the agencies envision the international community adopting good practices proven in other fields, tailoring practices from other sectors to the needs of medical devices and creating new approaches specific to the industry.
Jeff Shuren, director of the FDA's Center for Devices and Radiological Health, said at last month's AdvaMed conference that the agency is putting these efforts "on steroids" while touting the more than 300 AI/ML devices granted authorization to date via 510(k), De Novo and PMA pathways, most of which have radiology applications.
FDA expects the technology to become an important part of an increasing number of medical devices used across other fields of medicine.
However, at an FDA workshop earlier this month, Shuren warned that better methodologies are needed to identify and improve algorithms prone to mirroring "systemic biases" in the healthcare system and in the data used to train AI/ML-based devices.
"It's essential that the data used to train [these] devices represent the intended patient population with regards to age, gender, sex, race and ethnicity," Shuren said.
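One basic way to look for the kind of bias Shuren describes is to break a model's performance down by demographic subgroup. The Python sketch below is a minimal illustration under stated assumptions, not a method endorsed by FDA: the record fields (`prediction`, `label`), the helper names `accuracy_by_subgroup` and `flag_gaps`, and the 0.1 gap threshold are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_subgroup(records, group_key):
    """Compute accuracy separately for each value of `group_key`.

    Each record is assumed to be a dict with "prediction" and "label"
    fields plus demographic fields such as "sex" or "age_band".
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        correct[group] += int(record["prediction"] == record["label"])
    return {group: correct[group] / totals[group] for group in totals}


def flag_gaps(per_group, max_gap=0.1):
    """Return subgroups whose accuracy trails the best subgroup by more than max_gap."""
    best = max(per_group.values())
    return sorted(g for g, acc in per_group.items() if best - acc > max_gap)
```

Running `flag_gaps(accuracy_by_subgroup(records, "sex"))` on a labeled evaluation set would list any subgroup that lags the best-performing one. Small subgroups produce unstable estimates, so a flagged gap is a prompt for closer review rather than proof of bias.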