Title
Mining officially unrecognized side effects of drugs by combining web search and machine learning
Abstract
We consider the problem of finding officially unrecognized side effects of drugs. By submitting queries to the Web involving a given drug name, it is possible to retrieve pages concerning the drug. However, many retrieved pages are irrelevant and some relevant pages are not retrieved. More relevant pages can be obtained by adding the active ingredient of the drug to the query. To eliminate irrelevant pages, we propose a machine learning process that filters them out. The process is shown experimentally to be very effective. Since obtaining training data for the machine learning process can be time-consuming and expensive, we provide an automatic method to generate the training data. The method is also shown to be very accurate. For three drugs, side effects that are not recognized by the FDA are validated by an expert. We believe that the same approach can be applied to many real-life problems and will yield high precision. Thus, this could lead to a new way to perform retrieval with high accuracy.
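The pipeline described in the abstract (expand the query with the active ingredient, then filter retrieved pages with a learned classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the classifier, so a small multinomial Naive Bayes filter is swapped in here, and the `OR` query syntax, the drug name "Examplex", the ingredient "examplamine", and the training snippets are all hypothetical.

```python
import math
import re
from collections import Counter


def build_query(drug_name, active_ingredient):
    """Combine brand name and active ingredient in one query.
    The OR syntax and the "side effects" phrase are assumptions."""
    return f'("{drug_name}" OR "{active_ingredient}") "side effects"'


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Tiny multinomial Naive Bayes with add-one smoothing.
    A stand-in for the unspecified classifier in the abstract."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, labeled_pages):
        """labeled_pages: iterable of (page_text, is_relevant) pairs."""
        for text, relevant in labeled_pages:
            self.doc_counts[relevant] += 1
            self.word_counts[relevant].update(tokenize(text))

    def _log_score(self, tokens, label):
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed class prior, so an unseen class never yields log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_relevant(self, text):
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Hypothetical training pages: (text, is_relevant).
training = [
    ("patients taking examplex reported nausea and dizziness", True),
    ("examplamine caused headache and rash in some users", True),
    ("buy examplex online cheap pharmacy discount", False),
    ("order cheap pills online free shipping discount", False),
]

page_filter = NaiveBayesFilter()
page_filter.train(training)

query = build_query("Examplex", "examplamine")
kept = page_filter.is_relevant("users reported dizziness and rash")
dropped = page_filter.is_relevant("cheap discount pharmacy online")
```

In a real setting the labeled pages would come from the automatic training-data generation step mentioned in the abstract, which is not detailed here and is therefore not sketched.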
Year
2005
DOI
10.1145/1099554.1099670
Venue
CIKM
Keywords
relevant page,training data,drug name,machine learning,unrecognized side effect,irrelevant page,automatic method,active ingredient,high precision,web search,side effect,high accuracy,precision
Field
Training set,Data mining,Information retrieval,Computer science,Artificial intelligence,Machine learning
DocType
Conference
ISBN
1-59593-140-6
Citations
0
PageRank
0.34
References
11
Authors
5
Name, Order, Citations, PageRank
Carlo Curino, 1, 201290.35
Yuan-Yuan Jia, 2, 328.06
Bruce L. Lambert, 3, 12916.41
Patricia M. West, 4, 0, 0.34
Clement T. Yu, 5, 31711419.96