Building High Performance Explainable Machine Learning Models For Social Media-Based Substance Use Prediction - Citegraph

Paper Info

Title
Building High Performance Explainable Machine Learning Models For Social Media-Based Substance Use Prediction

Abstract
Social media contain rich information that can help us understand the human mind and behavior. Social media data, however, are mostly unstructured (e.g., text and images), and a large number of features may be needed to represent them (e.g., millions of unigrams may be needed to represent social media texts). Moreover, accurately assessing human behavior is often difficult (e.g., assessing addiction may require a medical diagnosis). As a result, the ground truth data needed to train a supervised human behavior model are often difficult to obtain at a large scale. To avoid overfitting, many state-of-the-art behavior models employ sophisticated unsupervised or self-supervised machine learning methods that leverage large amounts of unlabeled data for both feature learning and dimension reduction. Unfortunately, despite their high performance, these advanced machine learning models often rely on latent features that are hard to explain. Since understanding the knowledge captured in these models is important to behavior scientists and public health providers, we explore new methods to build machine learning models that are not only accurate but also interpretable. We evaluate the effectiveness of the proposed methods in predicting Substance Use Disorders (SUD). We believe the proposed methods are general and applicable to a wide range of data-driven human trait and behavior analysis applications.

Field	Value
Year	2020
DOI	10.1142/S021821302060009X
Venue	International Journal on Artificial Intelligence Tools
Keywords	Human behavior, personal traits, social media, substance use disorders, explainable AI, causal inference
DocType	Journal
Volume	29
Issue	3-4
ISSN	0218-2130
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Tao Ding	1	15	8.48
Fatema Hasan	2	0	0.34
Warren K Bickel	3	12	3.51
Shimei Pan	4	684	64.41
