Project

A problem-specific non-dominated sorting genetic algorithm for supervised feature selection

Abstract

Feature selection (FS), which plays an important role in classification tasks, has recently been studied as a multi-objective optimization problem (MOP). In this paper, we formulate FS as the minimization of three objectives and propose a problem-specific non-dominated sorting genetic algorithm (PS-NSGA). PS-NSGA applies an accuracy-preferred domination operator, which makes individuals with higher classification accuracy more likely to survive in the population. It also uses a quick bit mutation, which overcomes the limitations of traditional bit-string mutation and improves efficiency. In addition, a mutation-retry operator and a combination operator are designed to help the algorithm converge faster and to better solutions. Finally, a solution selection strategy is developed to choose the most suitable feature subset from the obtained Pareto solutions. Experimental results on 15 real-world high-dimensional datasets demonstrate that the proposed algorithm achieves competitive classification accuracy while selecting smaller feature subsets than several state-of-the-art evolutionary and traditional FS algorithms.
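
To make the multi-objective framing concrete, the following is a minimal sketch of standard Pareto dominance over three minimized objectives. It is not the accuracy-preferred domination operator of PS-NSGA, whose definition is not given in this abstract. The function names `dominates` and `non_dominated`, the objective vectors, and the interpretation of each objective in the comments are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical objective vectors, all to be minimized. The three objectives
# are placeholders (for example: error rate, fraction of features kept, and
# some third criterion); the abstract does not name them.
candidates = [
    (0.10, 0.30, 0.5),
    (0.12, 0.20, 0.4),
    (0.15, 0.35, 0.6),  # worse than the first candidate on every objective
]

front = non_dominated(candidates)
print(front)
```

Under plain Pareto dominance, the first two candidates trade off against each other and both remain on the front, while the third is discarded. An accuracy-preferred variant would presumably bias this comparison toward the accuracy-related objective.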

Framework flowchart