Title |
---|
Outlier Detection and Data Cleaning in Multivariate Non-Normal Samples: The PAELLA Algorithm |
Abstract |
---|
A new method of outlier detection and data cleaning for both normal and non-normal multivariate data sets is proposed. It is based on an iterated local fit without a priori metric assumptions. We propose a new approach supported by finite mixture clustering, which provides good results with large data sets. A multi-step structure, consisting of three phases, is developed. The importance of outlier detection in industrial modeling for open-loop control prediction is also described. The described algorithm gives good results both in simulation runs with artificial data sets and with experimental data sets recorded in a rubber factory. Finally, the methodology is discussed. |
Year | DOI | Venue |
---|---|---|
2004 | 10.1023/B:DAMI.0000031630.50685.7c | Data Min. Knowl. Discov. |
Keywords | DocType | Volume |
outlier detection,Multivariate Non-Normal Samples,non-normal,artificial data set,Outlier Detection,mixture model,experimental data,new approach,industrial modeling,em algorithm,data cleaning,cluster analysis,large data set,multivariate,non-normal multivariate data set,new method,finite mixture clustering,PAELLA Algorithm,outlier,good result | Journal | 9 |
Issue | ISSN | Citations |
2 | 1573-756X | 11 |
PageRank | References | Authors |
0.97 | 3 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Manuel Castejón Limas | 1 | 17 | 4.28 |
Joaquín Ordieres-Meré | 2 | 102 | 14.39 |
Francisco J. Martínez De Pisón Ascacibar | 3 | 11 | 0.97 |
Eliseo P. Vergara González | 4 | 11 | 0.97 |