Week 3 Classification

  

KNN: the basic idea is that samples with similar input values are likely to belong to the same class.
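To make the KNN idea concrete, here is a minimal sketch in plain Python. It is not the course's implementation: the function name `knn_predict`, the use of Euclidean distance, the choice `k=3`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors.

    `train` is a list of (features, label) pairs; features are numeric tuples.
    Euclidean distance is an assumption here; other distance measures also work.
    """
    # Sort training samples by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # The most common label among those neighbors is the prediction.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

print(knn_predict(train, (1.1, 1.0)))  # A
print(knn_predict(train, (5.1, 5.0)))  # B
```

A query point close to the first group is labeled "A" because all three of its nearest neighbors carry that label.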

  

  

Decision Tree

  

  

  

  

Naive Bayes

  

  

Week 4 Evaluating Models


Over-fitting

How to avoid overfitting when training a Decision Tree: Pre-Pruning and Post-Pruning

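Pre-pruning uses two stopping conditions: 1. the number of records at a node falls below a threshold, e.g. fewer than 20; 2. the node's purity reaches a given value, e.g. 80%. When a condition is met, the node is not split any further.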
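The two stopping conditions can be sketched as a small check that a tree builder would call before splitting a node. This is an illustrative sketch only: the name `should_stop`, the definition of purity as the fraction of records in the majority class, and the default thresholds (20 records, 80%) are assumptions taken from the examples in these notes.

```python
from collections import Counter


def should_stop(labels, min_records=20, purity_threshold=0.8):
    """Return True if a node holding these class labels should not be split.

    Purity is assumed to mean the fraction of records in the majority class.
    """
    if not labels:
        return True
    # Condition 1: too few records at this node.
    if len(labels) < min_records:
        return True
    # Condition 2: the node is already pure enough.
    majority_count = Counter(labels).most_common(1)[0][1]
    purity = majority_count / len(labels)
    return purity >= purity_threshold


print(should_stop(["yes"] * 10 + ["no"] * 5))   # True: only 15 records
print(should_stop(["yes"] * 25 + ["no"] * 5))   # True: purity is 25/30
print(should_stop(["yes"] * 15 + ["no"] * 15))  # False: 30 records, purity 0.5
```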

How to choose a validation set

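Holdout method: part of the data is held out as the validation set. Because the training set and the validation set may then have different distributions, there is also an extension called repeated holdout.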
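A minimal sketch of holdout and repeated holdout in plain Python follows. The function names, the 30% validation fraction, the number of repeats, and the `evaluate` callback are hypothetical choices for illustration, not something specified in the course notes.

```python
import random


def holdout_split(data, validation_fraction=0.3, seed=None):
    """Shuffle the data once and hold out a fraction of it for validation."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    # Returns (training set, validation set).
    return shuffled[n_val:], shuffled[:n_val]


def repeated_holdout(data, evaluate, repeats=5, validation_fraction=0.3, seed=0):
    """Repeat the holdout split with different shuffles and average the scores.

    `evaluate(train, validation)` is a caller-supplied function that trains a
    model on `train` and returns a numeric score measured on `validation`.
    """
    scores = []
    for i in range(repeats):
        train, validation = holdout_split(data, validation_fraction, seed=seed + i)
        scores.append(evaluate(train, validation))
    return sum(scores) / len(scores)


train, validation = holdout_split(range(10), validation_fraction=0.3, seed=1)
print(len(train), len(validation))  # 7 3
```

Averaging over several random splits reduces the chance that one unlucky split, where the two sets differ a lot, dominates the estimate.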

Besides accuracy and error rate, there are also F1 and the Confusion Matrix.
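The following sketch shows how these metrics relate to one another for a binary classifier. The function names and the toy labels are made up for illustration; the formulas are the standard definitions (accuracy = correct / total, error rate = 1 - accuracy, F1 = harmonic mean of precision and recall).

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def summarize(y_true, y_pred, positive=1):
    """Derive accuracy, error rate, and F1 from the confusion matrix."""
    cm = confusion_matrix(y_true, y_pred, positive)
    total = sum(cm.values())
    accuracy = (cm["tp"] + cm["tn"]) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = cm["tp"] / (cm["tp"] + cm["fp"]) if cm["tp"] + cm["fp"] else 0.0
    recall = cm["tp"] / (cm["tp"] + cm["fn"]) if cm["tp"] + cm["fn"] else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "error_rate": 1 - accuracy, "f1": f1}


# Hypothetical labels: 1 is the positive class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

print(confusion_matrix(y_true, y_pred))  # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(summarize(y_true, y_pred))
```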

Week 5 Regression, Cluster, Association

Association:
