Due to the widespread use of complex machine learning models in real-world
applications, it is becoming critical to explain model predictions. However,
these models are typically black-box deep neural networks, explained post-hoc
via methods with known faithfulness limitations. Generalized Additive Models
(GAMs) are an inherently interpretable class of models that address this
limitation by learning a non-linear shape function for each feature separately,
followed by a linear model on top. However, these models are typically
difficult to train, require numerous parameters, and are difficult to scale.
We propose an entirely new subfamily of GAMs that utilizes basis
decomposition of shape functions. A small number of basis functions are shared
among all features, and are learned jointly for a given task, thus making our
model scale much better to large-scale data with high-dimensional features,
especially when features are sparse. We propose an architecture denoted as the
Neural Basis Model (NBM) which uses a single neural network to learn these
bases. On a variety of tabular and image datasets, we demonstrate that for
interpretable machine learning, NBMs are the state-of-the-art in accuracy,
model size, and, throughput and can easily model all higher-order feature
interactions.