Title
Galaxy: Towards Scalable and Interpretable Explanation on High-Dimensional and Spatio-Temporal Correlated Climate Data
Abstract
Interpretability has become a major criterion for designing predictive models in climate science. High interpretability provides a qualitative understanding of the relationship between meteorological variables and climate phenomena, which helps climate scientists learn the causes of climate events. However, detecting the features that offer effective interpretability for observed events is challenging in spatio-temporal climate data, because key features may be masked by the redundancy that arises from the high degree of spatial and temporal correlation among the features, especially in high dimensions. Furthermore, because climate events that occur in different regions or at different times may have different explanatory patterns, detecting explanations for overall climate phenomena is also difficult. Here we propose Galaxy, a new interpretable predictive model. Galaxy allows us to represent the explanatory patterns of subpopulations within an overall population of the target. Each explanatory pattern is defined by the smallest feature subset that the conditional distribution of the target actually depends on, which we define as the minimal target explanation. Based on the detection of such explanatory patterns, Galaxy can detect the Galaxy space, that is, the explanations for the overall target population, by sequentially detecting the target explanation of every individual subpopulation of the target, and can then forecast the target variable using their ensemble predictive power. We validate our approach by comparing Galaxy to several state-of-the-art baselines in a set of comparative experiments, and then evaluate how Galaxy can be used to identify the explanatory space and give a referential explanation report in a real-world scenario for a given location in the United States.
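The sequential scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain AdaBoost-style loop (AdaBoost appears only as a keyword of the paper), it stands in for the "minimal target explanation" with the single feature whose decision stump has the lowest weighted error, and it treats the reweighted samples of each round as the next "subpopulation". The function names `weighted_stump`, `boosted_explanations`, and `predict` are hypothetical.

```python
import numpy as np

def weighted_stump(x, y, w):
    """Best threshold and sign for one feature under sample weights w (y in {-1, +1})."""
    best = (np.inf, 0.0, 1)
    for t in np.unique(x):
        pred = np.where(x > t, 1, -1)
        err = np.sum(w[pred != y])
        # Weights sum to 1, so flipping the sign turns the error into 1 - err.
        for sign, e in ((1, err), (-1, 1.0 - err)):
            if e < best[0]:
                best = (e, t, sign)
    return best  # (weighted error, threshold, sign)

def boosted_explanations(X, y, rounds=3):
    """Assumed stand-in: one explanatory feature per boosting round."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)  # start with uniform weights over the population
    model = []
    for _ in range(rounds):
        # Choose the feature that best explains the currently emphasized samples.
        cands = [weighted_stump(X[:, j], y, w) + (j,) for j in range(d)]
        err, t, sign, j = min(cands, key=lambda c: c[0])
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = sign * np.where(X[:, j] > t, 1, -1)
        # Emphasize samples this explanation gets wrong (the next "subpopulation").
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((j, t, sign, alpha))
    return model

def predict(model, X):
    """Weighted vote of all per-round explanations."""
    score = sum(a * s * np.where(X[:, j] > t, 1, -1) for j, t, s, a in model)
    return np.where(score >= 0, 1, -1)
```

In this sketch, the list of selected feature indices plays the role of an explanation report, and the weighted vote plays the role of the ensemble forecast. A faithful implementation would replace the single-feature stump with the paper's own feature-subset detection.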
Year
2018
DOI
10.1109/ICBK.2018.00027
Venue
2018 IEEE International Conference on Big Knowledge (ICBK)
Keywords
Interpretable explanation, Long-lead rainfall forecasting, AdaBoost
Field
Population, Interpretability, Conditional probability distribution, Predictive power, Curse of dimensionality, Feature extraction, Redundancy (engineering), Artificial intelligence, Galaxy, Machine learning
DocType
Conference
ISBN
978-1-5386-9126-7
Citations
0
PageRank
0.34
References
10
Authors
6
Name  Order  Citations  PageRank
Yong Zhuang  1  25413.88
David L. Small  2  0  0.34
Xin Shu  3  316.50
Kui Yu  4  3610.16
Shafiqul Islam  5  337.94
Wei Ding  6  83472.61