Improving glass-box models with various variable transformations
The dilemma of choosing between interpretability and model performance is drawing increasing attention. The authors of this article tested various variable transformations to increase the performance of glass-box models. For their experiments they used the Concrete_Data dataset from the OpenML database. The data is technical: it describes the proportions of different ingredients in building materials, and the model aims to predict the compressive strength of the resulting material.
Once we are familiar with the dataset, we need feature engineering techniques to improve our model. Sometimes we do this by hand, tweaking the features until the results are satisfactory, but we can also leave the work to an automated function that handles it for us.
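To make the two routes concrete, here is a minimal sketch in plain Python. It is not the authors' code and does not use the Concrete_Data dataset: the data is synthetic, and the names `fit_simple_ols`, `r_squared`, `CANDIDATES`, and `best_transform` are hypothetical helpers introduced only for illustration. The manual route applies one hand-picked transformation to a feature before fitting a linear model; the automated route tries a small list of candidate transformations and keeps the one with the best fit.

```python
import math

def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r_squared(xs, ys, a, b):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Synthetic data (an assumption for illustration only): the target
# depends on the logarithm of a single positive feature.
feature = [float(v) for v in range(1, 51)]
target = [2.0 + 3.0 * math.log(v) for v in feature]

# Manual route: fit on the raw feature, then on a hand-picked log transform.
a_raw, b_raw = fit_simple_ols(feature, target)
r2_raw = r_squared(feature, target, a_raw, b_raw)

log_feature = [math.log(v) for v in feature]
a_log, b_log = fit_simple_ols(log_feature, target)
r2_log = r_squared(log_feature, target, a_log, b_log)

# Automated route: try each candidate transformation and keep the best one.
CANDIDATES = {
    "identity": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,
}

def best_transform(xs, ys, candidates):
    """Return (name, r2) of the candidate transformation with the highest R^2."""
    scores = {}
    for name, func in candidates.items():
        transformed = [func(x) for x in xs]
        a, b = fit_simple_ols(transformed, ys)
        scores[name] = r_squared(transformed, ys, a, b)
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]

best_name, best_r2 = best_transform(feature, target, CANDIDATES)
print(f"raw R^2: {r2_raw:.4f}, log R^2: {r2_log:.4f}, selected: {best_name}")
```

The model stays a single straight line in the transformed feature, so it remains easy to read. In this sketch the candidates are scored on the same data they were fitted on, which is only acceptable for a toy example; on real data the comparison would normally use a held-out set or cross-validation.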