Decision Tree
Decision Tree builds classification or regression models in the form of a tree structure. It breaks the dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed at the same time.
Decision Tree learning uses a top-down recursive method. The basic idea is to construct the tree so that information entropy declines as fast as possible, until the entropy of the instances in each leaf node is (ideally) zero. Each internal node of the tree corresponds to an attribute, and each leaf node corresponds to a class label.
Advantages:
- Decisions are easy to explain: the tree results in a set of rules, which is the same approach humans generally follow while making decisions (see the sketch after this list).
- Interpretation of a complex Decision Tree can be simplified by visualization, so it can be understood by everyone.
- It has almost no hyper-parameters to tune.
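For instance, here is a minimal scikit-learn sketch that fits a tree and prints it as a set of human-readable rules; the iris dataset, `max_depth`, and all parameter values are my own illustrative choices, not from the original post:

```python
# Minimal sketch: fit a depth-limited decision tree and print it
# as a set of if/else rules (dataset and parameters are illustrative).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data.data, data.target)

# Each internal node tests one attribute; each leaf holds a class label.
print(export_text(clf, feature_names=data.feature_names))
```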
Information Gain
- The entropy of a random variable $X$ with distribution $P(X = x_i) = p_i$ is:

$$H(X) = -\sum_{i=1}^{n} p_i \log p_i$$

- From this definition, we can calculate the empirical entropy of a dataset $D$:

$$H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|}$$

where $|D|$ is the number of samples and $|C_k|$ is the number of samples belonging to class $C_k$.

- We can also calculate the empirical conditional entropy of $D$ given a feature $A$ that partitions $D$ into subsets $D_1, \dots, D_n$:

$$H(D \mid A) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i)$$

- From these two quantities, we get the information gain of feature $A$:

$$g(D, A) = H(D) - H(D \mid A)$$

- Information gain ratio:

$$g_R(D, A) = \frac{g(D, A)}{H_A(D)}, \qquad H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$$

- Gini index:

$$\mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$$

For binary classification:

$$\mathrm{Gini}(p) = 2p(1 - p)$$

For binary classification and on the condition of feature $A$ splitting $D$ into $D_1$ and $D_2$:

$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|} \mathrm{Gini}(D_1) + \frac{|D_2|}{|D|} \mathrm{Gini}(D_2)$$
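To make these definitions concrete, here is a small worked sketch in Python that computes the empirical entropy and the information gain of one feature; the toy data and function names are my own illustrations, not from the original post:

```python
import math
from collections import Counter

def entropy(labels):
    """Empirical entropy H(D) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(labels, feature_values):
    """g(D, A) = H(D) - H(D|A) for one feature column."""
    total = len(labels)
    # Group the labels by the value the feature takes on each sample.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum((len(g) / total) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy data: does a loan get approved, split by "has_job"?
labels  = ["yes", "yes", "no", "no", "yes", "no"]
has_job = ["y",   "y",   "n",  "n",  "y",   "y"]
print(information_gain(labels, has_job))  # ~0.459 bits
```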
Three Building Algorithms
- ID3: maximizing information gain
- C4.5: maximizing the information gain ratio
- CART
- Regression Tree: minimizing the squared error.
- Classification Tree: minimizing the Gini index.
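These criteria map onto scikit-learn's tree estimators; a quick sketch follows. Note that scikit-learn implements an optimized CART, so `criterion="entropy"` only swaps the split criterion rather than reproducing ID3/C4.5 exactly:

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# CART classification tree: minimize the Gini index (the default).
gini_tree = DecisionTreeClassifier(criterion="gini")

# Entropy-based splitting, in the spirit of ID3/C4.5's information gain.
entropy_tree = DecisionTreeClassifier(criterion="entropy")

# CART regression tree: minimize the squared error
# (this criterion was named "mse" in scikit-learn versions before 1.0).
reg_tree = DecisionTreeRegressor(criterion="squared_error")
```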
Decision Tree Algorithm Pseudocode
- Place the best attribute of the dataset at the root of the tree. How to select the best attribute is described in Three Building Algorithms above.
- Split the training set into subsets by the best attribute.
- Repeat Step 1 and Step 2 on each subset until leaf nodes are reached in all branches of the tree, as in the sketch below.
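Here is a minimal recursive Python sketch of these three steps (ID3-style, categorical features only). It reuses the `information_gain` helper from the earlier sketch, and all names and the toy data are my own illustrations:

```python
from collections import Counter

def build_tree(rows, labels, features):
    """rows: list of dicts mapping feature name -> value."""
    # Base case: a pure node (or no features left) becomes a leaf
    # labeled with the majority class.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    # Step 1: place the best attribute at the root of this subtree
    # (here: the feature with the highest information gain).
    best = max(features,
               key=lambda f: information_gain(labels, [r[f] for r in rows]))

    # Step 2: split the training set into subsets by the best attribute.
    node = {best: {}}
    for value in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        # Step 3: repeat Steps 1 and 2 on each subset until every
        # branch ends in a leaf.
        node[best][value] = build_tree([rows[i] for i in idx],
                                       [labels[i] for i in idx],
                                       [f for f in features if f != best])
    return node

# Toy usage, extending the loan-style example from the entropy sketch:
rows = [{"has_job": "y", "owns_house": "n"},
        {"has_job": "y", "owns_house": "y"},
        {"has_job": "n", "owns_house": "n"},
        {"has_job": "n", "owns_house": "n"},
        {"has_job": "y", "owns_house": "y"},
        {"has_job": "y", "owns_house": "n"}]
labels = ["yes", "yes", "no", "no", "yes", "no"]
print(build_tree(rows, labels, ["has_job", "owns_house"]))
```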
Random Forest
Random Forest classifiers work around the limitations of a single tree by creating a whole bunch of decision trees (hence 'forest'), each trained on a random subset of the training samples (bagging, drawn with replacement) and of the features (drawn without replacement), and then making the trees work together to produce the result. In one word, it builds on CART with randomness.
- Randomness 1: train each tree on a subset of the training set selected by bagging (sampling with replacement), as in the sketch after this list.
- Randomness 2: train each tree on a subset of the features (sampling without replacement). For example, select 10 features out of 100 features in the dataset.
- Randomness 3: add new features by low-dimensional projection.
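A minimal scikit-learn sketch of the first two kinds of randomness; the dataset and parameter values are my own illustrative choices, scikit-learn draws the feature subset per split rather than per tree, and it does not implement the third kind:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

clf = RandomForestClassifier(
    n_estimators=100,     # number of CART trees in the forest
    bootstrap=True,       # Randomness 1: bagging, rows sampled with replacement
    max_features="sqrt",  # Randomness 2: feature subset sampled at each split
    random_state=0,
)
clf.fit(X, y)

# The trees work together: predict() returns the majority vote.
print(clf.score(X, y))
```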
Postscript
I tried to show off by writing this blog post in English, hoping to use it to practice my writing, and got mercilessly slapped in the face ( ̄ε(# ̄)
References:
- https://clyyuanzi.gitbooks.io/julymlnotes/content/rf.html
- http://www.saedsayad.com/decision_tree.htm
- http://dataaspirant.com/2017/01/30/how-decision-tree-algorithm-works/
- 统计学习方法 (Li Hang, Statistical Learning Methods)