
Variable selection is a central problem in expert systems and knowledge discovery. However, traditional variable selection methods achieve low accuracy on large, complex datasets because they do not consider the importance of variable combinations. A variable combination is a set of variables that jointly fit better than other, redundant variables do, even though each member may fit worse when considered individually. To address this, an association-based evolutionary ensemble variable selection method for multiple linear regression is proposed. Both greedy and stochastic algorithms are applied within an ensemble framework. Instead of ranking variables by selection frequency, the method considers the associations among variables in local solutions to identify important variable combinations. Experiments show that the proposed approach is competitive for variable selection and outperforms some classic methods on large-scale datasets. The study is relevant to automated feature discovery in complex systems with multiple combined input variables.
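The contrast between ranking by selection frequency and looking at associations among variables in local solutions can be illustrated with a small sketch. It is not the method proposed in the paper: the local solutions are made-up variable subsets, and the helper names (`selection_frequency`, `pair_support`, `frequent_pairs`) and the `min_support` threshold are assumptions introduced here for illustration. Co-occurrence support of variable pairs stands in for whatever association measure the paper actually uses.

```python
from collections import Counter
from itertools import combinations


def selection_frequency(solutions):
    """Count how often each variable appears across local solutions."""
    return Counter(v for sol in solutions for v in sol)


def pair_support(solutions):
    """Fraction of local solutions in which each variable pair co-occurs."""
    counts = Counter()
    for sol in solutions:
        counts.update(combinations(sorted(sol), 2))
    n = len(solutions)
    return {pair: c / n for pair, c in counts.items()}


def frequent_pairs(solutions, min_support=0.5):
    """Pairs whose support reaches the (hypothetical) threshold, best first."""
    support = pair_support(solutions)
    return sorted(
        (pair for pair, s in support.items() if s >= min_support),
        key=lambda p: (-support[p], p),
    )


# Hypothetical local solutions, e.g. variable subsets returned by several
# greedy and stochastic searches (illustrative data only).
local_solutions = [
    {"x1", "x2", "x5"},
    {"x1", "x2"},
    {"x3", "x5"},
    {"x1", "x2", "x4"},
    {"x3", "x5"},
    {"x4", "x5"},
]

freq = selection_frequency(local_solutions)
pairs = frequent_pairs(local_solutions, min_support=0.5)

print(freq.most_common())  # per-variable selection counts
print(pairs)               # variable pairs that tend to be selected together
```

In this toy data the single most frequently selected variable is `x5`, yet the only pair that appears together in at least half of the local solutions is (`x1`, `x2`). A frequency ranking alone would not reveal that these two variables tend to be chosen as a combination.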

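Copyright: CIMS Research Center, Tongji University. Address: College of Electronic and Information Engineering Building (Zhixin Building), Jiading Campus, Tongji University, 4800 Cao'an Road, Shanghai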
