Note for video Machine Learning and Data Mining: Training vs Testing
Here is the note for lecture five.
There are two main points:
1. Training and Testing
Both of these are about data. Training uses the data to pick a final hypothesis; testing does not use the data to pick anything.
Once we have a final hypothesis and want to evaluate it on data, that is testing.
2. Another way to verify that learning is feasible. First, let me show you an inequality.
[Image: the inequality (image not recovered)]
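The image of the inequality did not survive. Given that the note says M reflects the complexity of the hypothesis set, it is most likely the Hoeffding inequality combined with a union bound over M hypotheses. The form below is an assumption based on that reading, not a copy of the original image; the symbols g (final hypothesis), E_in and E_out (in-sample and out-of-sample error), N (number of data points) and ε (tolerance) are the usual ones and do not appear in the note itself.

```latex
\[
  \mathbb{P}\bigl[\,\lvert E_{\text{in}}(g) - E_{\text{out}}(g)\rvert > \epsilon \,\bigr]
  \;\le\; 2M\, e^{-2\epsilon^{2} N}
\]
```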
As mentioned in note 2, the complexity of your hypothesis set is reflected by M in the inequality.
However, M is almost always infinite, which makes the bound meaningless, and the guarantee for your hypothesis becomes useless.
If we can replace M with another quantity that is not infinite, the bound becomes meaningful, and we can start
our learning with an actual model (that is, learning is feasible).
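To see why a huge M ruins the guarantee, here is a minimal sketch. It assumes the inequality is the Hoeffding bound with a union bound, whose right-hand side is 2·M·exp(-2·ε²·N). The function name and the sample values of N, ε and M are made up for illustration.

```python
import math

def hoeffding_union_bound(M, N, eps):
    """Right-hand side 2*M*exp(-2*eps^2*N) of the assumed bound."""
    return 2 * M * math.exp(-2 * eps**2 * N)

N, eps = 1000, 0.1  # illustrative sample size and tolerance
for M in (1, 100, 10**6, 10**12):
    value = hoeffding_union_bound(M, N, eps)
    # A probability bound above 1 tells us nothing
    status = "meaningful" if value < 1 else "meaningless"
    print(f"M = {M:>13}: bound = {value:.3e} ({status})")
```

With N and ε fixed, the bound grows linearly in M, so an infinite M gives no guarantee at all.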
What is M? As mentioned before, M is the number of hypotheses in the hypothesis set. So can we count something else to
replace M? The answer is yes.
Count the hypotheses by the different choices they make on different points. If there are a undetermined points, and each
point has b possible choices, then the maximum number of distinguishable hypotheses is b^a (for example, 2^N for N points with two labels each).
But things are not that simple: some of these hypotheses cannot be built by a given hypothesis set,
so in general the number of hypotheses
that can actually be built is less than b^a.
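The counting argument above can be checked on a toy case. The sketch below uses a hypothetical hypothesis set that the note does not name: "positive rays" on a line, h(x) = +1 if x > t, else -1. It lists every labeling this set can build on 4 points and compares the count with the 16 labelings that 4 points with 2 choices each allow. The point values are arbitrary.

```python
from itertools import product

def positive_ray_dichotomies(points):
    """Distinct labelings produced by h(x) = +1 if x > t else -1, over all thresholds t."""
    xs = sorted(points)
    # One threshold below all points, one between each adjacent pair, one above all points
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    return {tuple(1 if x > t else -1 for x in points) for t in thresholds}

points = [0.5, 1.7, 2.2, 3.9]  # arbitrary distinct points
built = positive_ray_dichotomies(points)
all_labelings = set(product((-1, 1), repeat=len(points)))

print(len(built), "labelings can be built out of", len(all_labelings))
```

Only 5 of the 16 labelings can be built. For example, (+1, -1, +1, -1) is impossible for this set.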
Let's come back to the inequality. It can be proved mathematically that
if M can be replaced by a polynomial in the number of data points, which is finite and grows slowly, then learning is feasible using this hypothesis set. A further statement will be proved in the next lecture: if the maximum number of hypotheses that can be built
falls below its largest possible value, then that number can be bounded by a polynomial, that is, learning is feasible using the hypothesis set.
According to the statement above, if there are some hypotheses that cannot be built, then the hypothesis set is feasible for learning.
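A rough numerical sketch of why a polynomial count rescues the bound while an exponential one does not. It assumes the Hoeffding-style form 2·count·exp(-2·ε²·N) and simply substitutes the count for M. The rigorous replacement, which the note defers to the next lecture, carries different constants, so this only illustrates the trend. The count N + 1 is a hypothetical polynomial example. Logarithms are used to avoid overflow.

```python
import math

def log_bound(log_count, N, eps):
    """Natural log of 2 * count * exp(-2 * eps^2 * N), given log(count)."""
    return math.log(2) + log_count - 2 * eps**2 * N

eps = 0.1  # illustrative tolerance
for N in (100, 1000, 10000):
    poly = log_bound(math.log(N + 1), N, eps)  # polynomial count: N + 1
    expo = log_bound(N * math.log(2), N, eps)  # exponential count: 2^N
    print(f"N = {N:>5}: log-bound with N+1 = {poly:9.2f}, with 2^N = {expo:9.2f}")
```

A negative log-bound means the bound is below 1 and therefore says something. With the polynomial count, the bound shrinks toward zero as N grows. With the exponential count, it keeps growing.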