Scalable linear classifiers based on exponential loss function

Abstract

We first propose an empirical risk minimization based binary classi- fication algorithm, ExpERM, with exponential function as surrogate loss function. Our ExpERM algorithm is scalable in both the di- mension of feature space as well as in the number of data points as it is an unconstrained differentiable convex optimization prob- lem or a convex optimization problem with a single constraint in the regularized ExpERM framework. Under mild assumption on data, we show that ExpERM classifier is unique. We implement ExpERM on wide collection (large-features, large-examples) of real datasets. We use Wilcoxon signed-rank test to show that these easily computable classifiers have good generalization property as their 5 or 10-fold cross validation error is not significantly different from those classifiers obtained by standard hinge loss based SVMs or AdaBoost classifiers. Using some large feature sized datasets, we make a statistical comparison of SVM and ExpERM and observe that ExpERM takes remarkably less time to train the model. Towards better understanding this simple and effective learning algorithm, we obtain PAC like sample complexity bounds for reg- ularized ExpERM and high probability bounds for ExpERM based on Rademacher complexity and uniform stability notions. Our computational experience suggests that the regularization does not significantly improve the performance of the ExpERM based classifier. Due to repeated calls to be made to the binary classifier routines in implementation of combined multi-class algorithms, ExpERM scheme is expected to be significantly computationally cheaper and hence scalable. We statistically show that CV error of our binary ExpERM classifiers on many multi-class datasets is comparable to those obtained using computationally expensive binary class SVMs.

Publication
CODS-COMAD ‘18: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data
Sandhya Tripathi
Sandhya Tripathi
Postdoctoral Research Associate

My research interests include clinical prediction model, fairness in AI models, database matching, and learning in the presence of label noise.