Abstract
Stacked generalization (also called stacking) is an ensemble method in machine learning that uses a metamodel to combine the predictive results of heterogeneous base models arranged in at least one layer. K-fold cross-validation is employed at the various stages of training in this method. An alternative validation strategy is to try out several splits of the data, leading to different train and test sets for the base models, and then use only the test predictions to train the metamodel; this is known as blending. In this work, we present a modification of an existing visual analytics system, called StackGenVis, that now supports composing robust and diverse ensembles of models with both of the aforementioned methods. We built multiple ensembles with our system using the two respective methods and tested their performance on six small- to large-sized data sets. The results indicate that stacking is significantly more powerful than blending according to three performance metrics. However, the training times of the base models and the final ensembles are lower and more stable across different train/test splits with blending than with stacking.
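To make the distinction concrete, the sketch below contrasts the two strategies in scikit-learn. It is a minimal illustration only: the data set, the choice of base models (a random forest and a k-nearest-neighbors classifier), the logistic-regression metamodel, the number of folds, and the 80/20 holdout ratio are assumptions made for demonstration, not the experimental setup used in the paper.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two heterogeneous base models, as in a one-layer stacked ensemble.
base = [("rf", RandomForestClassifier(random_state=0)),
        ("knn", KNeighborsClassifier())]

# Stacking: k-fold cross-validation produces out-of-fold predictions that
# become the metamodel's training set (here, 5 folds).
stack = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(),
                           cv=5)
stack.fit(X_train, y_train)
print("stacking accuracy:", stack.score(X_test, y_test))

# Blending: a single holdout split; the base models are fit on one part,
# and their predictions on the held-out part train the metamodel.
X_fit, X_hold, y_fit, y_hold = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)
hold_preds, test_preds = [], []
for _, model in base:
    model.fit(X_fit, y_fit)
    hold_preds.append(model.predict_proba(X_hold)[:, 1])
    test_preds.append(model.predict_proba(X_test)[:, 1])
meta = LogisticRegression().fit(np.column_stack(hold_preds), y_hold)
print("blending accuracy:",
      meta.score(np.column_stack(test_preds), y_test))
```

Repeating the blending split with different random seeds mimics the paper's setup of trying several train/test splits; only the base models are refit each time, which is one reason blending's training times stay lower and more stable than stacking's k-fold procedure.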
Original language | English |
---|---|
Title of host publication | 2021 23rd International Conference on Control Systems and Computer Science (CSCS) |
Publisher | IEEE |
Pages | 1-8 |
Number of pages | 8 |
ISBN (Print) | 978-1-6654-3940-4 |
Publication status | Published - 28 May 2021 |
Event | 2021 23rd International Conference on Control Systems and Computer Science (CSCS) - Bucharest, Romania |
Duration | 26 May 2021 → 28 May 2021 |
Conference
Conference | 2021 23rd International Conference on Control Systems and Computer Science (CSCS) |
---|---|
Period | 26/05/21 → 28/05/21 |
Keywords
- Training
- Measurement
- Computer science
- Visual analytics
- Computational modeling
- Stacking
- Machine learning