- A systematic review of randomized clinical trials of machine-learning interventions has found a lack of high-quality studies of the technologies.
- The literature review, details of which were published in the journal JAMA Network Open, found 41 randomized clinical trials (RCTs) of medical machine-learning interventions. The RCTs failed to meet reporting guidelines and lacked participants from underrepresented minority groups.
- According to the study authors, the findings “highlight areas of concern regarding the quality of medical machine learning RCTs and suggest opportunities to improve reporting transparency and inclusivity.”
Researchers have investigated machine-learning models in areas including cancer diagnosis and decision support in intensive care. The investigations have generated many thousands of research papers but have led to only tens of RCTs. In the systematic review, the researchers' search returned almost 20,000 articles but only 41 RCTs. The 41 RCTs involved a median of 294 participants.
As of the last count, the Food and Drug Administration listed 343 commercially available medical devices that are enabled by artificial intelligence or machine learning. The list is not meant to be exhaustive. Given that the review found only 41 RCTs, the authors conclude that “most FDA-approved machine learning–enabled medical devices are approved without efficacy demonstrated in an RCT.”
In addition, many of the RCTs that were run had limitations. According to the study, none of the RCTs complied with all of the standards established through CONSORT-AI, a set of recommendations for clinical trial reports evaluating interventions with an AI component. Thirty-eight of the RCTs failed to meet the requirements for assessing poor-quality or unavailable input data and for analyzing performance errors. Seven of the RCTs had a high risk of bias.
Diversity is another issue. The study found that just 11 of the RCTs reported race and ethnicity data. Among those trials, the median proportion of participants from underrepresented minority groups was 21%.