Comparison among Methods of Decision Tree Pruning
Abstract: To help select a suitable pruning method for decision trees, four widely used pruning methods are compared, through theoretical analysis and a worked example on a classification and regression tree, in terms of computational complexity, pruning strategy (traversal order), error estimation and theoretical basis. Compared with pessimistic error pruning (PEP), minimum error pruning (MEP) produces trees that are less accurate and larger. Reduced error pruning (REP) is one of the simplest pruning methods, but it requires a separate pruning set. At the same accuracy, cost-complexity pruning (CCP) produces a smaller tree than REP. In practice, REP is a reasonable choice when training data are abundant; when training data are limited and high accuracy after pruning is required, PEP is a good choice.
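As an illustration of why REP is both simple and dependent on a separate pruning set, the following is a minimal sketch of bottom-up reduced error pruning on a hand-built binary tree. It is not the authors' implementation: the `Node` class, the function names and the toy data are all hypothetical, and the sketch assumes numeric features, threshold splits, and that ties are resolved in favor of pruning.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


@dataclass
class Node:
    """Hypothetical binary tree node; not taken from the paper."""
    label: int                       # majority class of training data at this node
    feature: Optional[int] = None    # index of the tested feature; None for a leaf
    threshold: float = 0.0           # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x: Sequence[float]) -> int:
    """Follow the splits from `node` down to a leaf and return its label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def errors(node: Node, data: List[Sample]) -> int:
    """Number of samples in `data` misclassified by the subtree at `node`."""
    return sum(predict(node, x) != y for x, y in data)


def count_leaves(node: Node) -> int:
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def reduced_error_prune(node: Node, prune_set: List[Sample]) -> Node:
    """Bottom-up REP: children are pruned first, then the subtree is replaced
    by a leaf if the leaf makes no more errors on the pruning samples that
    reach this node (assumption: ties favor pruning)."""
    if node.is_leaf():
        return node
    left_data = [(x, y) for x, y in prune_set if x[node.feature] <= node.threshold]
    right_data = [(x, y) for x, y in prune_set if x[node.feature] > node.threshold]
    node.left = reduced_error_prune(node.left, left_data)
    node.right = reduced_error_prune(node.right, right_data)
    leaf_errors = sum(y != node.label for _, y in prune_set)
    if leaf_errors <= errors(node, prune_set):
        return Node(label=node.label)
    return node


# Toy tree whose right subtree contains an overfitted split on feature 1.
tree = Node(label=0, feature=0, threshold=0.5,
            left=Node(label=0),
            right=Node(label=1, feature=1, threshold=0.5,
                       left=Node(label=1), right=Node(label=0)))

# Separate pruning set, disjoint from whatever data grew the tree.
prune_set: List[Sample] = [((0.0, 0.0), 0), ((0.2, 1.0), 0),
                           ((1.0, 0.0), 1), ((1.0, 1.0), 1), ((0.9, 0.9), 1)]

leaves_before = count_leaves(tree)
errors_before = errors(tree, prune_set)
pruned = reduced_error_prune(tree, prune_set)
leaves_after = count_leaves(pruned)
errors_after = errors(pruned, prune_set)
```

Each node is visited once and only compared against the pruning samples that reach it, which is what keeps the method simple. The pruning samples must be withheld from training, so on small data sets this cost can be significant.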