M. Ricamato, C. Marrocco, F. Tortorella
Class imbalance is a critical problem in classification tasks arising in many real-world applications. A large number of solutions have been proposed in the literature, at both the algorithmic and the data level. In this paper we analyze the data-level approach and, in particular, we focus on the use of Multiple Classification Systems in which each classifier is trained on a dataset containing the minority class and a subset of the majority class samples. The aim of this approach is to avoid the drawbacks of other methods commonly used in this context, which force a balanced distribution by oversampling the minority class. We compare the results obtained by applying different realizations of the method to datasets from the UCI Repository.
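The training scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the disjoint partitioning of the majority class, the nearest-centroid base classifier, the majority-vote combination rule, and all function names are assumptions chosen only to keep the example short and self-contained.

```python
import random
from collections import Counter


def balanced_subsets(majority, minority, seed=0):
    """Pair the full minority set with disjoint chunks of the majority set.

    Assumption: chunks are disjoint and roughly the size of the minority
    class; the abstract only says "a subset of the majority class samples".
    """
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    n_subsets = max(1, len(shuffled) // len(minority))
    chunks = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(chunk, minority) for chunk in chunks]


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def train_nearest_centroid(neg, pos):
    """Hypothetical base classifier: label 1 (minority) if x is closer to
    the minority centroid, otherwise label 0 (majority)."""
    c_neg, c_pos = centroid(neg), centroid(pos)

    def predict(x):
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        return 1 if d_pos < d_neg else 0

    return predict


def train_ensemble(majority, minority, seed=0):
    """Train one base classifier per balanced subset."""
    return [
        train_nearest_centroid(neg, pos)
        for neg, pos in balanced_subsets(majority, minority, seed)
    ]


def predict_majority_vote(ensemble, x):
    """Combine the base classifiers by majority vote (an assumed rule)."""
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]
```

With 40 majority samples and 4 minority samples, this sketch builds 10 base classifiers, each trained on a balanced set of 4 majority and 4 minority samples, so that no minority sample is duplicated and no majority sample is discarded.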