What are the advantages of logistic regression over decision trees?

The answer to "Should I ever use learning algorithm (a) over learning algorithm (b)?" will pretty much always be yes. Different learning algorithms make different assumptions about the data and have different rates of convergence. The one which works best, i.e. minimizes some cost function of interest (cross-validation error, for example), will be the one whose assumptions are consistent with the data and which has sufficiently converged to its error rate.

Put in the context of decision trees vs. logistic regression, what are the assumptions made?

Decision trees assume that our decision boundaries are parallel to the axes. For example, if we have two features (x1, x2), then a tree can only create rules such as x1 >= 4.5 or x2 >= 6.5, which we can visualize as lines parallel to the axes. We can see this in practice in the sketch below.

So decision trees chop up the feature space into rectangles (or, in higher dimensions, hyper-rectangles). Many partitions can be made, so decision trees naturally scale up to more complex functions (say, of higher VC dimension), which can lead to over-fitting.
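
To make this concrete, here is a minimal sketch (not from the original post; the data and names are made up) that fits a scikit-learn tree to two features and prints its rules: every split is a single-feature threshold.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))                    # two features: x1, x2
y = ((X[:, 0] >= 4.5) & (X[:, 1] >= 6.5)).astype(int)    # labels live in a rectangle

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["x1", "x2"]))
# Every printed rule is a single-feature threshold, e.g. "x1 <= 4.49" or
# "x2 <= 6.52" -- an axis-parallel cut, never a combination of features.
```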

What assumptions does logistic regression make? Despite its probabilistic framework, all that logistic regression assumes is that there is one smooth linear decision boundary. It finds that boundary by assuming that P(Y|X) has some particular form, such as the inverse logit function applied to a weighted sum of our features, and then it finds the weights by a maximum likelihood approach.
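
As a sketch of what that means (assuming scikit-learn, whose default fit is maximum likelihood with an L2 penalty; the data here is made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def inverse_logit(z):
    # P(Y=1 | x) = 1 / (1 + exp(-(w . x + b)))
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X @ np.array([2.0, -1.0]) + 0.5 + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)      # maximum likelihood (with an L2 penalty by default)
w, b = model.coef_[0], model.intercept_[0]
p_manual = inverse_logit(X @ w + b)         # the assumed functional form, by hand
assert np.allclose(p_manual, model.predict_proba(X)[:, 1])   # same probabilities
```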

However, people get too caught up in that probabilistic framing. The decision boundary logistic regression creates is a linear* decision boundary that can be of any orientation. So if you have data where the true decision boundary is not parallel to the axes, logistic regression picks it out pretty well, whereas a decision tree will have problems.
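
A rough comparison under those conditions (an illustrative scikit-learn sketch; the true boundary here is the 45-degree line x1 + x2 = 0):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 2))
y = (X.sum(axis=1) > 0).astype(int)             # diagonal decision boundary

lr_acc = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
dt_acc = cross_val_score(DecisionTreeClassifier(max_depth=3, random_state=0), X, y, cv=5).mean()
print(f"logistic regression: {lr_acc:.3f}  decision tree: {dt_acc:.3f}")
# Logistic regression recovers the diagonal boundary almost exactly; the
# shallow tree can only approximate it with a staircase of axis-parallel splits.
```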

So in conclusion,

  • Both algorithms are really fast. There isn't much to distinguish them in terms of run-time.
  • Logistic regression will work better if there's a single decision boundary, not necessarily parallel to the axes.
  • Decision trees can be applied to situations where there's not just one underlying decision boundary, but many, and will work best if the class labels roughly lie in hyper-rectangular regions.
  • Logistic regression is intrinsically simple: it has low variance and so is less prone to over-fitting. Decision trees can be scaled up to be very complex and are more liable to over-fit; pruning is applied to avoid this.

Maybe you'll be left thinking, "I wish decision trees didn't have to create rules that are parallel to the axes." This motivates support vector machines.

Footnotes:
* Linear in your covariates. If you include non-linear transformations or interactions of the covariates, then the boundary will be non-linear in the space of those original covariates.
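
As a sketch of the footnote's point (again assuming scikit-learn; the circular-boundary data is made up), adding quadratic features lets logistic regression fit a boundary that is curved in the original covariates, even though it is still linear in the transformed features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(int)   # circular boundary

plain = LogisticRegression().fit(X, y)                 # linear in (x1, x2): poor fit
curved = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression()).fit(X, y)
print(plain.score(X, y), curved.score(X, y))           # the quadratic expansion does far better
```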
