Title
RANSAC-GP: Dealing with Outliers in Symbolic Regression with Genetic Programming
Abstract
Genetic programming (GP) has been shown to be a powerful tool for automatic modeling and program induction. It is often used to solve difficult symbolic regression tasks, with many examples in real-world domains. However, the robustness of GP-based approaches has not been substantially studied. In particular, the present work deals with the issue of outliers, data points in the training set that represent severe errors in the measuring process. In general, a datum is considered an outlier when it sharply deviates from the true behavior of the system of interest. GP practitioners know that such data points usually bias the search and produce inaccurate models. Therefore, this work presents a hybrid methodology based on the RANdom SAmple Consensus (RANSAC) algorithm and GP, which we call RANSAC-GP. RANSAC is an approach to deal with outliers in parameter estimation problems, widely used in computer vision and related fields. This work presents the first application of RANSAC to symbolic regression with GP. The proposed algorithm is able to deal with extreme amounts of contamination in the training set, evolving highly accurate models even when the proportion of outliers reaches 90%.
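The RANSAC idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' RANSAC-GP implementation: an ordinary least-squares line fit stands in for the GP run, and the names `fit_line`, `ransac`, `tolerance`, and the synthetic data are all assumptions made for illustration. The loop repeatedly fits a model to a small random sample, counts the points that agree with it (the consensus set), and keeps the model with the largest consensus.

```python
import random


def fit_line(points):
    """Least-squares line y = a*x + b (a stand-in for a GP run; assumption)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None  # all x values equal: the sample cannot define a line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac(points, sample_size=2, iterations=200, tolerance=0.5, seed=0):
    """Return (model, inliers) for the largest consensus set found."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Fit a candidate model to a small random sample.
        model = fit_line(rng.sample(points, sample_size))
        if model is None:
            continue
        a, b = model
        # 2. Collect the points whose residual is within the tolerance.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tolerance]
        # 3. Keep the candidate with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    # 4. Refit on the whole consensus set for the final estimate.
    if len(best_inliers) >= 2:
        refit = fit_line(best_inliers)
        if refit is not None:
            best_model = refit
    return best_model, best_inliers


# Synthetic data: 20 points on y = 2x + 1 and 30 gross outliers (60% contamination).
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outlier_points = [(float(i), 1000.0 + i * i) for i in range(30)]
points = inlier_points + outlier_points

model, inliers = ransac(points)
print("slope, intercept:", model)
print("consensus set size:", len(inliers))
```

In the hybrid approach described above, a GP symbolic regression run would presumably take the place of `fit_line`; how the paper sets the sample size, tolerance, and stopping rule is not stated in this record.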
Year
2017
DOI
10.1007/978-3-319-55696-3_8
Venue
GENETIC PROGRAMMING, EUROGP 2017
Keywords
Genetic programming, RANSAC, Robust regression, Outliers
Field
Data point, RANSAC, Computer science, Outlier, Genetic programming, Robustness (computer science), Robust regression, Artificial intelligence, Estimation theory, Symbolic regression, Machine learning
DocType
Conference
Volume
10196
ISSN
0302-9743
Citations
1
PageRank
0.40
References
10
Authors
6
Name, Order, Citations, PageRank
Uriel López, 1, 3, 1.46
Leonardo Trujillo, 2, 41, 11.33
Yuliana Martínez, 3, 42, 5.70
Pierrick Legrand, 4, 90, 16.20
Enrique Naredo, 5, 49, 5.55
Sara Silva, 6, 183, 12.53