The FDA had authorized 1,451 AI-enabled medical devices by the end of 2025, a record 295 of them in 2025 alone. Every one of those submissions started with a training dataset that had to hold up under regulatory scrutiny.
Grayde.ai, a life science AI data company and division of Milestone Localization, today released The AI SaMD Data Playbook, a free ebook for ML engineers, data scientists, and regulatory teams building AI-enabled Software as a Medical Device (SaMD).
In January 2025, the FDA issued draft guidance proposing a more detailed set of expectations for training data, annotation, fine-tuning, and evaluation.
It signals a clear direction: data practices once treated as internal engineering decisions are heading toward formal review as part of the safety and effectiveness evidence in a submission.
Separately, the FDA has already finalized guidance on Predetermined Change Control Plans (PCCP), the mechanism that governs how AI models can be updated after clearance, giving teams firmer ground to plan against on that front.
The playbook breaks down what this means in practice:
- Which submission pathway applies: 510(k), De Novo, or PMA
- What a regulatory-ready dataset is expected to look like
- Who should be qualified to label your data, and how to document it
- How to identify and disclose bias before a reviewer does
- Which evaluation metrics FDA reviewers are likely to expect
- What a PCCP needs to cover, under the finalized guidance, for models that keep learning
"Most teams treat training data as an engineering detail, separate from the regulatory submission," said Nikita Agarwal, Founder of Grayde.ai. "Some of this guidance is still in draft and open to comment, but the direction is clear enough to start preparing for. This playbook helps teams get ahead of where the requirements are heading, instead of scrambling once they're finalized."
Checklist included:
The guide also includes a comprehensive 37-point pre-submission checklist divided into eight sections. It draws on FDA draft and final guidance documents, IMDRF's Good Machine Learning Practice principles, and published regulatory literature. It is intended for educational purposes and does not constitute regulatory or legal advice.
How to download it:
The AI SaMD Data Playbook is available now as a free download on Grayde.ai's website.
Grayde.ai builds regulatory-ready training data for AI-enabled medical devices. Every label is produced by a credentialed life science specialist, fully attributed, and documented to the standard FDA submissions require. The company serves SaMD developers, medical AI labs, and pharma AI teams.
Grayde.ai is a division of Milestone Localization, a life science company certified to ISO 9001, ISO 13485, and ISO 17100.