PRML Reading Notes: Introduction

0xAC 2024-10-29 23:53:30

1.1. Example: Polynomial Curve Fitting

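　　1. Motivate a number of concepts:

　　　　(1) linear models: functions that are linear in the unknown parameters. A polynomial is nonlinear in the input x but linear in the coefficients w, so it is a linear model. For the polynomial curve fitting problem, the model is:

　　　　　　　　$$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j$$

　　　　where M is the order of the polynomial.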

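　　　　(2) error function: an error function measures the misfit between the model's predictions and the training set points. One simple and widely used choice is the sum of the squares of the errors:

　　　　　　　　$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2$$

　　　　where y(x_n, w) is the model's prediction for input x_n, t_n is the corresponding target value, and N is the number of training points.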

　　　　(3) model comparison or model selection

　　　　(4) over-fitting: the model obtains an excellent fit to the training data but gives very poor performance on test data. This behavior is known as over-fitting.

　　　　(5) regularization: a technique often used to control over-fitting. It adds a penalty term to the error function to discourage the coefficients from reaching large values. The simplest such penalty term is the sum of squares of all of the coefficients, leading to a modified error function of the form:

　　　　　　　　$$\tilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2$$

　　　　where $\|\mathbf{w}\|^2 = w_0^2 + w_1^2 + \dots + w_M^2$ and the coefficient λ governs the importance of the penalty term relative to the sum-of-squares error. This particular case of a quadratic regularizer is called ridge regression (Hoerl and Kennard, 1970). In the context of neural networks, this approach is known as weight decay.

　　　　(6) validation set, also called a hold-out set: to solve a practical application by minimizing an error function, we need a way to determine a suitable value for the model complexity. A simple way to do this is to partition the available data into a training set, used to determine the coefficients w, and a separate validation set (the hold-out set), used to optimize the model complexity.
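The ideas in items (1) to (6) can be tried out together in a few lines of NumPy. What follows is only a minimal sketch under stated assumptions: the synthetic data (a noisy sine curve), the noise level, the candidate values of M and λ, and the helper names `design_matrix`, `fit_polynomial`, and `rms_error` are all illustrative choices, not something taken from these notes. The regularized fit is solved as an ordinary least-squares problem on an augmented system, which is one of several equivalent ways to minimize the modified error function.

```python
import numpy as np

def design_matrix(x, M):
    """Columns are x**0, x**1, ..., x**M, so the prediction Phi @ w is linear in w."""
    return np.vander(x, M + 1, increasing=True)

def fit_polynomial(x, t, M, lam=0.0):
    """Minimize 0.5 * sum((Phi @ w - t)**2) + 0.5 * lam * ||w||**2.

    Stacking sqrt(lam) * I under Phi (and zeros under t) turns the penalized
    problem into a plain least-squares problem with the same minimizer.
    """
    Phi = design_matrix(x, M)
    A = np.vstack([Phi, np.sqrt(lam) * np.eye(M + 1)])
    b = np.concatenate([t, np.zeros(M + 1)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def rms_error(x, t, w):
    """Root-mean-square misfit between predictions and targets."""
    y = design_matrix(x, len(w) - 1) @ w
    return np.sqrt(np.mean((y - t) ** 2))

# Assumed synthetic data: a sine curve plus Gaussian noise.
rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(0.0, 1.0, n)
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, t

x_train, t_train = make_data(10)   # used to determine the coefficients w
x_val, t_val = make_data(100)      # hold-out set, used to choose M and lam

# Model selection: fit on the training set, score on the validation set.
scores = {}
for M in (1, 3, 9):
    for lam in (0.0, 1e-3):
        w = fit_polynomial(x_train, t_train, M, lam)
        scores[(M, lam)] = rms_error(x_val, t_val, w)
        print(f"M={M} lam={lam:g} "
              f"train RMS={rms_error(x_train, t_train, w):.3f} "
              f"validation RMS={scores[(M, lam)]:.3f}")

best = min(scores, key=scores.get)
print("selected (M, lam):", best)
```

Comparing the training and validation columns for M = 9 with and without the penalty is a quick way to see over-fitting and the effect of regularization on a small data set.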

1.2. Probability Theory

1. The rules of probability: the sum rule and the product rule.

　　sum rule: $$p(X) = \sum_{Y} p(X, Y)$$

　　product rule: $$p(X, Y) = p(Y \mid X)\, p(X)$$

2. Bayes’ theorem.

　　$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}, \qquad p(X) = \sum_{Y} p(X \mid Y)\, p(Y)$$

3. Probability densities

4. Expectations and covariances

5. Bayesian probabilities.

　　Bayes’ theorem is used to convert a prior probability into a posterior probability by incorporating the evidence provided by the observed data.

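6. Gaussian distribution

　　For a single real-valued variable x with mean μ and variance σ²:

　　$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$$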

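7. Maximizing the posterior distribution is equivalent to minimizing the regularized sum-of-squares error function. With Gaussian noise of precision β on the targets and a zero-mean Gaussian prior of precision α over w, the regularization parameter is λ = α/β.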
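The equivalence in item 7 can be checked numerically. The sketch below assumes Gaussian noise with precision β on the targets and a zero-mean Gaussian prior with precision α over w; the particular values of α and β, the synthetic data, and the names `neg_log_posterior_grad`, `w_map`, and `w_ridge` are illustrative assumptions. Up to constants, the negative log posterior is β/2 Σ{y(x_n, w) − t_n}² + α/2 wᵀw, so its minimizer should coincide with the regularized least-squares solution for λ = α/β.

```python
import numpy as np

alpha, beta = 5e-3, 11.1   # assumed prior precision and noise precision
M = 3                      # assumed polynomial order

# Assumed synthetic data: a sine curve with noise of variance 1 / beta.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, beta ** -0.5, 20)
Phi = np.vander(x, M + 1, increasing=True)   # columns x**0 ... x**M
I = np.eye(M + 1)

def neg_log_posterior_grad(w):
    """Gradient of beta/2 * sum((Phi @ w - t)**2) + alpha/2 * w @ w."""
    return beta * (Phi.T @ (Phi @ w - t)) + alpha * w

# MAP estimate: set the gradient of the negative log posterior to zero.
w_map = np.linalg.solve(beta * Phi.T @ Phi + alpha * I, beta * Phi.T @ t)

# Regularized least squares with lam = alpha / beta.
lam = alpha / beta
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * I, Phi.T @ t)

print("max |w_map - w_ridge|:", np.abs(w_map - w_ridge).max())
print("max |gradient at w_ridge|:", np.abs(neg_log_posterior_grad(w_ridge)).max())
```

Both printed quantities should be at the level of floating-point round-off, since dividing the MAP normal equations by β gives exactly the ridge normal equations.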

1.3. Model Selection

1.6. Information Theory

1. Entropy
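The note above only names the concept, so here is a minimal sketch of the standard definition for a discrete distribution, H[x] = −Σ p(x) log p(x), with 0 log 0 taken as 0. The function name `entropy` and the choice of base 2 (entropy in bits) are assumptions made for illustration.

```python
import numpy as np

def entropy(p, base=2.0):
    """Entropy of a discrete distribution p, ignoring zero-probability outcomes."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                      # 0 * log(0) is treated as 0
    # sum p * log(1/p) is the same as -sum p * log(p)
    return float(np.sum(nz * np.log(1.0 / nz)) / np.log(base))

print(entropy([0.5, 0.5]))    # fair coin
print(entropy([1.0, 0.0]))    # certain outcome carries no information
print(entropy([0.25] * 4))    # uniform over four states
```

A uniform distribution gives the largest entropy for a given number of states, while a distribution concentrated on one outcome gives zero.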
