BOOMER is an algorithm for learning ensembles of gradient boosted multi-output rules that integrates with the popular scikit-learn machine learning framework. It can be used to train a machine learning model on labeled training data, which can afterwards be used to make predictions for unseen data. In contrast to prominent boosting algorithms like XGBoost or LightGBM, the algorithm is aimed at multi-output problems. On the one hand, this includes multi-label classification problems, where an individual example does not correspond to a single class but may be associated with several labels at the same time. Real-world applications of this problem domain include the assignment of keywords to text documents, the annotation of multimedia data, such as images, videos, or audio recordings, as well as applications in biology, chemistry, and other fields. On the other hand, it includes multi-output regression problems, which require predicting more than a single numerical output variable.
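To make the notion of a multi-output target concrete, the following minimal sketch shows how keyword assignments for a few documents can be encoded as a binary label matrix, and what a multi-output regression target looks like by comparison. The toy data and the helper `to_label_matrix` are hypothetical and are not part of BOOMER. The sketch only illustrates the shape of the data that such problems involve.

```python
# Hypothetical toy data: each document is tagged with one or more keywords.
keywords_per_document = [
    {"boosting", "rules"},
    {"rules"},
    {"boosting", "regression", "rules"},
]


def to_label_matrix(label_sets):
    """Return the sorted list of labels and a binary matrix with one row per
    example and one column per label (1 = label is relevant, 0 = irrelevant)."""
    labels = sorted(set().union(*label_sets))
    matrix = [[1 if label in s else 0 for label in labels] for s in label_sets]
    return labels, matrix


labels, y_classification = to_label_matrix(keywords_per_document)

# In multi-output regression, each row instead holds several numerical targets.
# The values below are made up purely for illustration.
y_regression = [
    [0.5, 12.0],
    [1.3, 8.5],
    [0.9, 10.1],
]

print(labels)
for row in y_classification:
    print(row)
```

In a single-label problem, every row of the matrix would contain exactly one 1. Here a row may contain several. In practice, scikit-learn's `MultiLabelBinarizer` can produce a comparable indicator matrix from sets of labels.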
To provide a versatile tool for different use cases, great emphasis is put on the efficiency of the implementation. Moreover, to ensure its flexibility, it is designed in a modular fashion and can therefore easily be adjusted to different requirements. This modular approach enables implementing different kinds of rule learning algorithms. For example, this project also provides a Separate-and-Conquer (SeCo) algorithm based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.
This document is intended for end users of our algorithms and developers who are interested in their implementation. In addition, the following links might be of interest:
For a detailed description of the methodology used by the algorithms, please refer to the list of publications.
The source code maintained by this project can be found in the GitHub repository.
Issues with the software, feature requests, or questions for the developers should be posted via the project’s issue tracker.