Integration of Classification and Pattern Mining: A Discriminative and Frequent Pattern-based Approach

H. Cheng

Many existing classification methods assume that the input data is given in a feature vector representation. However, in many tasks, the predefined feature space is not discriminative enough to distinguish between classes. Worse, in many other applications, the input data has no predefined feature vector at all; examples include transactions, sequences, graphs, and semistructured data. In both scenarios, a primary challenge is how to construct a discriminative and compact feature set. Besides the widely studied machine learning and statistical approaches, frequent pattern mining can be considered as another approach. This direction is interesting because frequent patterns are usually statistically significant and semantically meaningful. The objective of this project is to use discriminative frequent patterns to characterize complex structural data and thus enhance classification power. I developed a framework for discriminative frequent pattern-based classification that could lead to a highly accurate, efficient, and interpretable classifier on complex data.
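As a rough illustration of the general idea, and not the framework developed in this work, the following Python sketch mines frequent itemsets from labeled transactions, ranks them by a discriminative score, and uses the top-ranked patterns as binary features. Everything specific here is an assumption made for the example: the brute-force enumeration, the choice of information gain as the score, the toy data, and all function names (`frequent_itemsets`, `information_gain`, `select_patterns`, `to_feature_vector`) are hypothetical.

```python
from itertools import combinations
from math import log2


def frequent_itemsets(transactions, min_support, max_len=2):
    """Brute-force enumeration of itemsets with support count >= min_support.

    Illustrative only: a real miner would prune the search space.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, max_len + 1):
        for candidate in combinations(items, k):
            pattern = frozenset(candidate)
            count = sum(1 for t in transactions if pattern <= t)
            if count >= min_support:
                result[pattern] = count
    return result


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * log2(p)
    return h


def information_gain(pattern, transactions, labels):
    """Entropy reduction from splitting the data on presence of the pattern."""
    present = [y for t, y in zip(transactions, labels) if pattern <= t]
    absent = [y for t, y in zip(transactions, labels) if not pattern <= t]
    n = len(labels)
    return (entropy(labels)
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))


def select_patterns(transactions, labels, min_support, top_k):
    """Keep the top_k frequent patterns ranked by information gain."""
    frequent = frequent_itemsets(transactions, min_support)
    ranked = sorted(
        frequent,
        key=lambda p: (-information_gain(p, transactions, labels), sorted(p)),
    )
    return ranked[:top_k]


def to_feature_vector(transaction, patterns):
    """Binary features: 1 if the transaction contains the pattern, else 0."""
    return [1 if p <= transaction else 0 for p in patterns]


# Toy data (assumed): the class is positive only when 'a' and 'b' co-occur.
transactions = [frozenset(s) for s in ("ab", "abc", "ac", "bc", "a", "b")]
labels = [1, 1, 0, 0, 0, 0]

selected = select_patterns(transactions, labels, min_support=2, top_k=1)
features = [to_feature_vector(t, selected) for t in transactions]
```

In this toy data neither item alone separates the classes, but the combined pattern {a, b} does, so it ranks first. The resulting binary vectors can then be passed to any standard feature-vector classifier.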