Interpreting Social Media-Based Substance Use Prediction Models with Knowledge Distillation - Citegraph

Paper Info

Title
Interpreting Social Media-Based Substance Use Prediction Models with Knowledge Distillation

Abstract
People nowadays spend a significant amount of time on social media such as Twitter, Facebook, and Instagram. As a result, social media data capture rich behavioral evidence that can help us understand people's thoughts, behavior, and decision-making processes. Social media data, however, are mostly unstructured (e.g., text and images) and may involve a large number of raw features (e.g., millions of raw text and image features). Moreover, ground truth data about human behavior and decision-making can be difficult to obtain at a large scale. As a result, most state-of-the-art social media-based human behavior models employ sophisticated unsupervised feature learning to leverage a large amount of unlabeled data. Unfortunately, these advanced models often rely on latent features that are hard to explain. Since understanding the knowledge captured in these models is important for behavior scientists, public health providers, and policymakers, in this research we focus on employing a knowledge distillation framework to build machine learning models with not only state-of-the-art predictive performance but also interpretable results. We evaluate the effectiveness of the proposed framework in explaining Substance Use Disorder (SUD) prediction models. Our best models achieved 87% ROC AUC for predicting tobacco use, 84% for alcohol use, and 93% for drug use, which are comparable to existing state-of-the-art SUD prediction models. Since these models are also interpretable (e.g., a logistic regression model and a gradient boosting tree model), we combine the results from these models to gain insight into the relationship between a user's social media behavior (e.g., social media likes and word usage) and substance use.
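
The general idea of distilling an opaque model into an interpretable one can be illustrated with a minimal sketch. This is not the authors' implementation: the teacher function, the synthetic features, and all parameter values below are hypothetical, chosen only to show how a logistic regression student can be fit to a teacher's soft predictions and then scored with ROC AUC.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def teacher_predict(x):
    # Hypothetical stand-in for an opaque teacher model: returns a soft
    # probability computed from a nonlinear score of the input features.
    score = 2.0 * x[0] - 1.5 * x[1] + 0.5 * math.tanh(3.0 * x[2])
    return sigmoid(score)

def distill_logistic(X, soft_targets, epochs=300, lr=0.5):
    # Fit a logistic regression student to the teacher's soft probabilities
    # by minimizing cross-entropy with full-batch gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(X, soft_targets):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of the cross-entropy w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def roc_auc(labels, scores):
    # Probability that a random positive outranks a random negative
    # (ties count as half a win).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data (an assumption for illustration, not the paper's dataset).
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(400)]
soft = [teacher_predict(x) for x in X]
labels = [1 if rng.random() < p else 0 for p in soft]

w, b = distill_logistic(X, soft)
student_scores = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in X]
print("student weights:", [round(wi, 2) for wi in w])
print("student ROC AUC:", round(roc_auc(labels, student_scores), 3))
```

In this sketch the student's weights can be read directly as the direction and relative strength of each feature's association with the predicted outcome, which is the kind of interpretability the abstract attributes to the logistic regression model.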

Year	DOI	Venue
2018	10.1109/ICTAI.2018.00100	2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Keywords	Field	DocType
human behavior, social media, substance use disorders, explainable AI, interpretable AI	Data modeling,Word usage,Social media,Computer science,Decision tree model,Artificial intelligence,Predictive modelling,Feature learning,Decision-making,Machine learning,Gradient boosting	Conference
ISSN	ISBN	Citations
1082-3409	978-1-5386-7450-5	0
PageRank	References	Authors
0.34	6	4

Authors (4 rows)

Name	Order	Citations	PageRank
Tao Ding	1	15	8.48
Fatema Hasan	2	0	0.34
Warren K. Bickel	3	0	0.34
Shimei Pan	4	684	64.41
