Machine learning - retrospective case study

Question

Can machine learning reduce the time spent on screening abstracts and titles for Heart Group reviews?

Method

Active learning

Results

We included 39 reviews, in which a total of 146,243 records were screened to identify 1,807 (1.2%) relevant records. This equals two researchers working full time for 7 months.

To reach 100% recall, the mean percentage of records that needed to be screened ranged from 1% to 95% across reviews.
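As an illustration of the metric, the percentage needed to screen for 100% recall can be read off a ranked record list as the position of the last relevant record divided by the total number of records. The sketch below assumes this common definition; the function name and the 0/1 input format are hypothetical and are not taken from our analysis code.

```python
def percent_to_screen_for_full_recall(ranked_labels):
    """Percentage of records screened by the time the last relevant one is found.

    ranked_labels: 0/1 flags in the order the records would be screened
    (1 = relevant). Hypothetical input format, for illustration only.
    """
    if 1 not in ranked_labels:
        raise ValueError("no relevant records in this review")
    # 1-based position of the last relevant record in screening order.
    last_relevant = max(
        i for i, flag in enumerate(ranked_labels, start=1) if flag == 1
    )
    return 100.0 * last_relevant / len(ranked_labels)


# Toy data (not from the study): 10 records, 2 relevant, both ranked early.
toy_ranking = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(percent_to_screen_for_full_recall(toy_ranking))
```

A review where a relevant record is ranked last gives 100%, which is how a single poorly ranked record can push a review towards the upper end of the range.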

In total, only 36% of records would have needed to be screened to identify all included records if this machine learning algorithm had been used. Screening this share is equivalent to 874 hours of screening time, or roughly 3 months of full-time work by two researchers.
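The arithmetic behind these figures can be checked with a short calculation. This is a rough sketch only: it assumes that the 874 hours refer to screening the 36% share and that screening time scales linearly with the number of records. The variable names are illustrative.

```python
# Figures reported in the results above.
TOTAL_RECORDS = 146243
RELEVANT_RECORDS = 1807
SHARE_SCREENED = 0.36
HOURS_FOR_SHARE = 874

# Records that would still be screened, and records that could be skipped.
records_screened = round(TOTAL_RECORDS * SHARE_SCREENED)
records_skipped = TOTAL_RECORDS - records_screened

# Share of relevant records among everything screened.
prevalence_pct = 100 * RELEVANT_RECORDS / TOTAL_RECORDS

# Assumption: time per record is constant, so hours scale with record count.
implied_total_hours = HOURS_FOR_SHARE / SHARE_SCREENED
implied_hours_saved = implied_total_hours - HOURS_FOR_SHARE

print(f"records screened: {records_screened}")
print(f"records skipped: {records_skipped}")
print(f"prevalence: {prevalence_pct:.1f}%")
print(f"implied total hours: {implied_total_hours:.0f}")
print(f"implied hours saved: {implied_hours_saved:.0f}")
```

Under these assumptions, the implied total is consistent with the 7 months of full-time work by two researchers reported for manual screening.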

We are still analysing the data and will aim to publish the full results in due course. 

Conclusions

Machine learning can save screening time, but we found great variability in performance across our reviews.