Predicting Drug Recalls from Internet Search Engine Queries
Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine, which mentioned one of the 5195 pharmaceutical drugs during 2015 and all recall notifications issued by the Food and Drug Administration (FDA) during that year. By using attributes that quantify the change in query volume at the state level, we attempted to predict if a recall of a specific drug will be ordered by FDA in a time horizon ranging from 1 to 40 days in future. Our results show that future drug recalls can indeed be identified with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as prediction is made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium-risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate in early warning of faulty batches of medicines.