Regularized SVM Classification with a new Complexity-Driven Stochastic Optimizer

J. Andrew Howe, Hamparsum Bozdogan

Abstract

Given a multivariate dataset composed of data from different known sources or processes, how can we create a rule to separate the data and classify any future data? Kernel discriminant analysis is one of many supervised learning techniques that handle this problem. Kernel methods have recently gained popularity in this and other knowledge discovery problems. This is somewhat ironic, as another common theme is variable reduction, and kernel methods actually inflate dimensionality. The inflation is excusable given the substantial benefits of processing "kernelized" data: kernel methods frequently outperform traditional classification techniques on real data when the classes are not easily separable. In performing kernel discriminant analysis, there are two main issues that we address in this article. The first is that, in the literature, the kernel function is often selected subjectively a priori, or determined by cross-validation with the sole objective of maximizing classification performance. The second is the question of how to know, after obtaining discriminant functions or support vectors to classify a dataset, which of our variables are most responsible for, and important to, the classification. In this research, we develop a new regularized algorithm that simultaneously selects the kernel function and a subset of the original variables. Our algorithm, a hybrid of cross-validation and the genetic algorithm, does this by optimizing a function that rewards correct classification while penalizing model complexity and misclassification.

We report results on three real datasets, including data from a medical imaging study. For the medical imaging data, we obtained a misclassification rate of 0.3% while reducing the number of features from p = 20 to p∗ = 6.
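
The search described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a chromosome made of one kernel name plus a binary mask over the original variables, and a score equal to an estimated misclassification rate plus a fixed charge per selected variable. The kernel list, the `penalty` value, the genetic operators, and the function `toy_cv_error` are all hypothetical; in practice `toy_cv_error` would be replaced by a real cross-validated error estimate, and the simple per-variable charge by whatever complexity criterion the article actually uses.

```python
import random

# Hypothetical candidate kernels; the article's actual set is not given here.
KERNELS = ["linear", "polynomial", "gaussian"]


def penalized_score(kernel, mask, cv_error, penalty=0.01):
    """Lower is better: estimated error plus a charge per selected variable."""
    if not any(mask):
        return float("inf")  # an empty variable subset cannot classify
    return cv_error(kernel, mask) + penalty * sum(mask)


def genetic_search(n_features, cv_error, pop_size=30, generations=40,
                   mutation_rate=0.05, penalty=0.01, seed=0):
    """Search over (kernel, variable mask) pairs with a simple genetic algorithm."""
    rng = random.Random(seed)

    def random_individual():
        mask = tuple(rng.random() < 0.5 for _ in range(n_features))
        return rng.choice(KERNELS), mask

    def score(ind):
        return penalized_score(ind[0], ind[1], cv_error, penalty)

    def crossover(a, b):
        # Uniform crossover: each gene is copied from a randomly chosen parent.
        kernel = rng.choice((a[0], b[0]))
        mask = tuple(rng.choice(pair) for pair in zip(a[1], b[1]))
        return kernel, mask

    def mutate(ind):
        kernel = rng.choice(KERNELS) if rng.random() < mutation_rate else ind[0]
        mask = tuple((not bit) if rng.random() < mutation_rate else bit
                     for bit in ind[1])
        return kernel, mask

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        parents = population[: pop_size // 2]  # truncation selection
        elite = population[0]                  # elitism: best individual survives
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - 1)]
        population = [elite] + children
    best = min(population, key=score)
    return best, score(best)


def toy_cv_error(kernel, mask):
    """Stand-in for a k-fold cross-validated misclassification rate (made up)."""
    informative = (0, 3, 5)  # pretend only these variables carry signal
    missing = sum(1 for i in informative if not mask[i])
    base = 0.05 if kernel == "gaussian" else 0.15
    return base + 0.1 * missing


(best_kernel, best_mask), best_score = genetic_search(8, toy_cv_error)
selected = [i for i, keep in enumerate(best_mask) if keep]
print(best_kernel, selected, round(best_score, 3))
```

Because the best individual is carried over unchanged in every generation, the reported score can never get worse as the search proceeds. The penalty weight controls the trade-off: a larger value pushes the search toward smaller variable subsets at the cost of a higher estimated error.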

Keywords

Supervised classification, Discriminant analysis, Support vectors, Information criteria, Feature selection, Stochastic optimization, Reproducing kernel Hilbert space
