Machine Learning - Lecture 16

Reinforcement Learning (R.L.)

① MDPs (Markov Decision Processes)

② Value Functions

③ Value Iteration

④ Policy Iteration

(both ③ and ④ are algorithms for solving R.L. problems)
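As a rough illustration of ③, here is a minimal value iteration sketch in Python. Everything in it is assumed for the example and is not taken from the lecture: the 3-state MDP, the names `P`, `R`, `GAMMA`, `value_iteration`, and `greedy_policy`, and the choice of a state-only reward with the update V(s) := R(s) + γ · max over a of Σ P(s'|s,a) · V(s').

```python
# Toy MDP, made up for illustration only (not from the lecture).
GAMMA = 0.9                      # discount factor (assumed value)
STATES = [0, 1, 2]
ACTIONS = ["stay", "move"]

# P[s][a] is a list of (probability, next_state) pairs.
P = {
    0: {"stay": [(1.0, 0)], "move": [(0.8, 1), (0.2, 0)]},
    1: {"stay": [(1.0, 1)], "move": [(0.8, 2), (0.2, 1)]},
    2: {"stay": [(1.0, 2)], "move": [(1.0, 2)]},
}
R = {0: 0.0, 1: 0.0, 2: 1.0}     # reward depends on the state only


def expected_value(V, s, a):
    """Expected value of the next state when taking action a in state s."""
    return sum(p * V[s2] for p, s2 in P[s][a])


def value_iteration(tol=1e-8):
    """Repeat the Bellman backup until the values change by less than tol."""
    V = {s: 0.0 for s in STATES}
    while True:
        new_V = {
            s: R[s] + GAMMA * max(expected_value(V, s, a) for a in ACTIONS)
            for s in STATES
        }
        delta = max(abs(new_V[s] - V[s]) for s in STATES)
        V = new_V
        if delta < tol:
            return V


def greedy_policy(V):
    """Pick, in each state, the action with the highest expected next value."""
    return {
        s: max(ACTIONS, key=lambda a: expected_value(V, s, a))
        for s in STATES
    }


if __name__ == "__main__":
    V = value_iteration()
    print(V)
    print(greedy_policy(V))
```

In this toy problem only state 2 gives a reward, so the computed values grow as the agent gets closer to state 2 and the greedy policy moves toward it.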

Supervised Learning: we had a training set in which we were given the right answer for every training example, and it was just the job of the learning algorithm to replicate more of the right answers.

Unsupervised Learning: we had just a bunch of unlabeled data, just the x's, and it was the job of the learning algorithm to discover so-called structure in the data. We saw several such algorithms: cluster analysis, K-means, mixtures of Gaussians, PCA, ICA and so on.

Today we talk about a different class of learning algorithms that sits between supervised and unsupervised learning: R.L.

There's a helicopter experiment performed by Andrew Ng at Stanford University (you can find the video and the details of that experiment on the Internet), in which an unmanned helicopter is controlled by R.L. algorithms.

It's different from Supervised Learning, because usually we actually do not know
