Hierarchical Imitation Learning via Subgoal Representation Learning for Dynamic Treatment Recommendation - Citegraph

Paper Info

Title
Hierarchical Imitation Learning via Subgoal Representation Learning for Dynamic Treatment Recommendation

Abstract
ABSTRACTDynamic Treatment Recommendation (DTR) is a sequence of tailored treatment decision rules which can be grouped as individual sub-tasks. As the reward signals in DTR are hard to design, Imitation Learning (IL) has achieved great success as it is effective in mimicking doctors' behaviors from their demonstrations without explicit reward signals. As a patient may have several different symptoms, the behaviors in doctors' demonstrations can often be grouped to handle individual symptoms. However, a single flat policy learned by IL is difficult to mimic doctors' demonstrations with such hierarchical structure, where low-level behaviors are switching from one symptom to another controlled by high-level decisions. Due to this observation, we consider Hierarchical Imitation Learning methods as good solutions for DTR. In this paper, we propose a novel Subgoal conditioned HIL framework (short for SHIL), where a high-level policy sequentially sets a subgoal for each sub-task without prior knowledge, and the low-level policy for sub-tasks is learned to reach the subgoal. To get rid of prior knowledge, a self-supervised learning method is proposed to learn an effective representation for each subgoal. More specifically, we carefully designed to encourage diverse representations among different subgoals. To demonstrate that SHIL is able to learn meaningful high-level policy and low-level policy that accurately reproduces complex doctors' demonstrations, we conduct experiments on a real-world medical data from health care domain, MIMIC-III. Compared with state-of-the-art baselines, SHIL improves the likelihood of patient survival by a significant margin and provides explainable recommendation with hierarchical structure.

Year	DOI	Venue
2022	10.1145/3488560.3498535	WSDM
Keywords	DocType	Citations
Dynamic Treatment Recommendation, Hierarchical Imitation Learning, Subgoal Representation Learning	Conference	0
PageRank	References	Authors
0.34	0	4

Authors (4 rows)

Cited by (0 rows)

References (0 rows)

Name	Order	Citations	PageRank
Lu Wang	1	15	3.57
Ruiming Tang	2	39	7.21
Xiaofeng He	3	1648	137.87
Xiuqiang He	4	312	39.21

1