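The native xgboost package provides a dump_model method that shows how each base learner's decision tree picks features to split its nodes. The base learners have two characteristics: they are binary trees, and a feature can be selected more than once to split the data held by the current node. The booster dump produced by dump_model has the following format: We can parse this tree structure to find how often each feature is used for splitting in the base learner. A simple script: # -*- coding: utf-8 -*- import re with open('./tree_like.txt', 'r') as f: l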
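The excerpt above cuts off before the script itself, so the following is only a minimal sketch of the idea, not the author's code. It assumes the plain-text dump layout that xgboost writes by default, where a split node looks like `0:[f2<2.45] yes=1,no=2,missing=1` and a leaf looks like `1:leaf=0.426`. The function name `count_split_features` and the sample dump are hypothetical and used for illustration only.

```python
import re
from collections import Counter

# A split node looks like "0:[f2<2.45] yes=1,no=2,missing=1".
# The capture group grabs the feature name between "[" and "<".
SPLIT_RE = re.compile(r"\d+:\[([^<\]]+)<")


def count_split_features(dump_text):
    """Count how many times each feature appears as a split in a text dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = SPLIT_RE.search(line)
        if match:  # leaf lines and "booster[i]:" headers do not match
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample in the assumed dump format (not taken from the post).
sample = """booster[0]:
0:[f2<2.45] yes=1,no=2,missing=1
\t1:leaf=0.426
\t2:[f3<1.75] yes=3,no=4,missing=3
\t\t3:[f2<4.95] yes=5,no=6,missing=5
\t\t\t5:leaf=-0.2
\t\t\t6:leaf=0.1
\t\t4:leaf=0.3
"""

print(count_split_features(sample).most_common())
# [('f2', 2), ('f3', 1)]
```

To apply this to a real dump, read the file written by dump_model (for example the post's './tree_like.txt') and pass its contents to the function. Because a feature may be reused at several depths of the same tree, a count greater than one per tree is expected.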
use LWP::UserAgent; use POSIX; use HTML::TreeBuilder::XPath; use Encode; use HTML::TreeBuilder; open DATAFH,">csdn.html" or die "open csdn file failed:$!"; my $ua = LWP::UserAgent->new; $ua->timeout(10); $ua->env_proxy; $ua
<pre name="code" class="python">use LWP::UserAgent; use POSIX; use HTML::TreeBuilder::XPath; use DBI; use Encode; use utf8; use HTML::TreeBuilder; open DATAFH,">csdn.html" or die "open csdn file failed:$!";