Computing Definite Integrals with the Monte Carlo Method: Importance Sampling

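As the figure above shows, computing the integral of f(x) over the interval [a, b] amounts to finding the area of the red region enclosed by the curve and the x-axis. Below, the Monte Carlo method is used to compute the definite integral over [2, 3]: ∫(x² + 4x·sin(x))dx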

 # -*- coding: utf-8 -*-
 import numpy as np
 import matplotlib.pyplot as plt

 def f(x):
     # integrand
     return x**2 + 4*x*np.sin(x)

 def intf(x):
     # antiderivative of f, used to get the exact value
     return x**3/3.0 + 4.0*np.sin(x) - 4.0*x*np.cos(x)

 a = 2
 b = 3

 # use N draws
 N = 10000
 X = np.random.uniform(low=a, high=b, size=N)  # N values uniformly drawn from a to b
 Y = f(X)                                      # evaluate f at the sample points

 # Monte Carlo estimate of the integral: area = width * mean height
 Imc = (b-a) * np.sum(Y) / N
 exactval = intf(b) - intf(a)
 print("Monte Carlo estimation=", Imc, "Exact number=", exactval)

 # How does the accuracy depend on the number of points (samples)? Try the same 1-D integral.
 # Monte Carlo methods yield approximate answers whose accuracy depends on the number of draws.
 Na = np.arange(1, 1001)     # sample sizes 1..1000 (N = 0 would divide by zero)
 Imc = np.zeros(len(Na))
 for i, N in enumerate(Na):
     X = np.random.uniform(low=a, high=b, size=N)  # N values uniformly drawn from a to b
     Y = f(X)
     Imc[i] = (b-a) * np.sum(Y) / N

 plt.plot(Na[10:], np.abs(Imc[10:]-exactval), alpha=0.7)  # absolute error of each estimate
 plt.plot(Na[10:], 1/np.sqrt(Na[10:]), 'r')               # 1/sqrt(N) reference curve
 plt.xlabel("N")
 plt.ylabel("|Imc-ExactValue|")
 plt.show()

Sample output (the estimate varies from run to run):

 Monte Carlo estimation= 11.8181144118 Exact number= 11.8113589251

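As the plot above shows, the error decreases as the number of sample points grows. There are two ways to improve the accuracy of the simulation: increase the number of trials N, or reduce the variance σ². Because the standard error of the estimate scales as σ/√N, halving the error by adding samples requires four times as many draws, so raising N alone quickly becomes expensive in computing time. The rest of this post introduces importance sampling, which reduces the variance and thereby improves the accuracy of the integral.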
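The 1/√N behaviour can be checked numerically. The following is a minimal sketch, not part of the original post: it reuses the integrand f from the listing above, and the helper names `mc_estimate` and `empirical_std`, the seed, and the repeat count are assumptions made for illustration. It repeats the plain Monte Carlo estimate many times for two sample sizes and compares the spread of the results.

```python
import numpy as np

def f(x):
    # Same integrand as in the listing above
    return x**2 + 4*x*np.sin(x)

def mc_estimate(n, rng, a=2.0, b=3.0):
    # Plain Monte Carlo estimate: width * mean height (hypothetical helper)
    x = rng.uniform(a, b, size=n)
    return (b - a) * np.mean(f(x))

def empirical_std(n, repeats=500, seed=0):
    # Spread of the estimator over independent repetitions (hypothetical helper)
    rng = np.random.default_rng(seed)
    return np.std([mc_estimate(n, rng) for _ in range(repeats)])

std_small = empirical_std(100)
std_large = empirical_std(10000)
print("std with N=100:  ", std_small)
print("std with N=10000:", std_large)
# With 100 times more samples the spread should shrink by roughly sqrt(100) = 10
print("ratio:", std_small / std_large)
```

The ratio is itself a noisy quantity, so it will only be approximately 10, but it makes the cost of buying accuracy with more samples concrete.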

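Importance sampling does not draw from the probability distribution of the given process. It draws from a modified distribution, chosen so that the events that matter most to the result occur more often. This improves sampling efficiency and reduces the time spent on events that contribute little to the result. For example, when g(x) is integrated over [a, b] with uniform sampling, regions where g(x) is small receive about as many sample points as regions where g(x) is large, which is clearly inefficient. The sampling density can instead be changed to f(x), with f(x) similar in shape to g(x), so that sample values contributing more to the integral appear more often than those contributing less. The integral can then be rewritten as:

$$I = \int_a^b g(x)\,dx = \int_a^b \frac{g(x)}{f(x)}\, f(x)\,dx$$

Here x is a random variable drawn from the probability density f(x), which must be normalized on [a, b]:

$$\int_a^b f(x)\,dx = 1$$

The integral I can therefore be viewed as the expectation of the random variable Y = g(x)/f(x), where the x_i are sample points distributed according to the density f(x):

$$I = E\left[\frac{g(x)}{f(x)}\right] \approx \frac{1}{N}\sum_{i=1}^{N} \frac{g(x_i)}{f(x_i)}$$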
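A short derivation, added here as a sketch and not taken from the original post, shows why the shape of f(x) matters. It assumes g(x) ≥ 0 on [a, b] and f(x) > 0 wherever g(x) ≠ 0; the symbol \hat{I}_N for the N-sample estimate is introduced only for this illustration.

```latex
\hat{I}_N = \frac{1}{N}\sum_{i=1}^{N}\frac{g(x_i)}{f(x_i)}, \qquad x_i \sim f

\operatorname{Var}\bigl[\hat{I}_N\bigr]
  = \frac{1}{N}\operatorname{Var}\!\left[\frac{g(x)}{f(x)}\right]
  = \frac{1}{N}\left(\int_a^b \frac{g(x)^2}{f(x)}\,dx - I^2\right)

f(x) = \frac{g(x)}{I}
  \;\Longrightarrow\;
  \int_a^b \frac{g(x)^2}{f(x)}\,dx = I\int_a^b g(x)\,dx = I^2
  \;\Longrightarrow\;
  \operatorname{Var}\bigl[\hat{I}_N\bigr] = 0
```

The zero-variance density requires knowing I in advance, so it cannot be used directly. It does explain the goal of picking f(x) close in shape to g(x): the closer the ratio g(x)/f(x) is to a constant, the smaller the variance.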

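The following example uses a normal density f(x) to approximate g(x) = sin(x)*x, and draws the sample values from that normal distribution to compute the integral ∫g(x)dx over [0, pi]. Note that in the code the integrand is named f and the sampling density is named p.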

 # -*- coding: utf-8 -*-
 # Example: calculate the integral of sin(x)*x over [0, pi].
 # The function has a shape that is similar to a Gaussian, and therefore
 # we choose a Gaussian as the importance sampling distribution.
 from scipy.stats import norm
 import numpy as np
 import matplotlib.pyplot as plt

 mu = 2
 sig = .7

 f = lambda x: np.sin(x)*x                       # integrand
 infun = lambda x: np.sin(x) - x*np.cos(x)       # antiderivative of the integrand
 p = lambda x: (1/np.sqrt(2*np.pi*sig**2))*np.exp(-(x-mu)**2/(2.0*sig**2))  # N(mu, sig^2) density
 normfun = lambda x: norm.cdf(x-mu, scale=sig)   # N(mu, sig^2) cumulative distribution

 plt.figure(figsize=(18,8))  # set the figure size

 # range of integration
 xmax = np.pi
 xmin = 0

 # Number of draws
 N = 1000

 # Just want to plot the function
 x = np.linspace(xmin, xmax, 1000)
 plt.subplot(1,2,1)
 plt.plot(x, f(x), 'b', label=r'Original  $x\sin(x)$')
 plt.plot(x, p(x), 'r', label='Importance Sampling Function: Normal')
 plt.xlabel('x')
 plt.legend()

 # =============================================
 # EXACT SOLUTION
 # =============================================
 Iexact = infun(xmax) - infun(xmin)
 print(Iexact)

 # ============================================
 # VANILLA MONTE CARLO
 # ============================================
 Ivmc = np.zeros(1000)
 for k in np.arange(0,1000):
     x = np.random.uniform(low=xmin, high=xmax, size=N)
     Ivmc[k] = (xmax-xmin)*np.mean(f(x))

 # ============================================
 # IMPORTANCE SAMPLING
 # ============================================
 # Choose a Gaussian so that it is similar to the original function.
 # Importance sampling: choose the random points so that
 # more points are chosen around the peak, fewer where the integrand is small.
 Iis = np.zeros(1000)

 # Normalization for the Gaussian on 0..pi. Note: the probability density must
 # integrate to 1 over the sampling interval [0, pi].
 normal = normfun(np.pi) - normfun(0)

 for k in np.arange(0,1000):
     # DRAW FROM THE GAUSSIAN: xis~N(mu,sig^2), keeping only the draws inside (0, pi)
     xis = mu + sig*np.random.randn(N)
     xis = xis[(xis<xmax) & (xis>xmin)]
     # Multiply by the integral of p(x) over [0, pi] to account for the truncation
     Iis[k] = np.mean(f(xis)/p(xis))*normal

 plt.subplot(1,2,2)
 plt.hist(Iis, 30, histtype='step', label='Importance Sampling')
 plt.hist(Ivmc, 30, color='r', histtype='step', label='Vanilla MC')
 plt.vlines(np.pi, 0, 100, color='g', linestyle='dashed')
 plt.legend()
 plt.show()
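The rejection step and the `normal` factor in the listing above together amount to sampling from a normal distribution truncated to [0, pi]. As an alternative sketch, not the original author's approach, the same estimate can be written with `scipy.stats.truncnorm`, which draws directly from the truncated distribution and provides its normalized density. The sample size, the seed, and the names `g`, `q`, and `I_est` are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import truncnorm

mu, sig = 2.0, 0.7          # same Gaussian parameters as in the listing above
xmin, xmax = 0.0, np.pi     # range of integration
N = 20000                   # number of draws (arbitrary choice for this sketch)

g = lambda x: np.sin(x) * x  # integrand

# truncnorm expects its bounds in standardized units: (bound - loc) / scale
lo, hi = (xmin - mu) / sig, (xmax - mu) / sig
q = truncnorm(lo, hi, loc=mu, scale=sig)  # normal density renormalized on [0, pi]

x = q.rvs(size=N, random_state=0)  # every draw already lies inside [0, pi]
I_est = np.mean(g(x) / q.pdf(x))   # q integrates to 1 on [0, pi], so no extra factor

print("Importance sampling estimate:", I_est, "exact:", np.pi)
```

Because `q.pdf` is already normalized on the integration range, the ratio g(x)/q(x) can be averaged directly, which removes one place where a normalization mistake could slip in.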

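The left panel shows that the curve sin(x)*x is close in shape to the normal density, so more sample points fall near the peak of the curve than where the curve is low. The exact value of the integral is pi. In the right panel, both methods compute the integral 1000 times: most results lie near the exact value pi ≈ 3.1416, and the count drops off with distance from it, as expected. However, the variance of the integral values computed with the conventional method (red histogram) is clearly larger than with importance sampling (blue histogram). Importance sampling therefore reduces the variance and improves accuracy. Note also that the choice of the sampling density f(x) affects the accuracy of the result: when the chosen f(x) differs substantially from g(x), the variance of the result increases.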
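The effect of the sampling density can be made concrete without any random draws. For a density q normalized on [0, pi], the variance of a single term g(x)/q(x) is the integral of g²/q minus I², and that integral can be evaluated on a grid. The sketch below is an illustration under stated assumptions: the helper names, the grid size, and the shifted Gaussian with mean 0.5 are hypothetical choices and do not come from the original post.

```python
import numpy as np
from scipy.stats import norm

xmin, xmax = 0.0, np.pi
g = lambda x: np.sin(x) * x  # integrand, exact integral over [0, pi] is pi

# Midpoint grid for a simple deterministic quadrature
M = 200000
dx = (xmax - xmin) / M
x = xmin + (np.arange(M) + 0.5) * dx

def per_sample_variance(q):
    # Var[g(X)/q(X)] for X ~ q: integral of g^2/q minus I^2 (hypothetical helper)
    second_moment = np.sum(g(x)**2 / q(x)) * dx
    return second_moment - np.pi**2

def truncated_normal(mu, sig):
    # Normal density renormalized to integrate to 1 on [xmin, xmax] (hypothetical helper)
    Z = norm.cdf(xmax, loc=mu, scale=sig) - norm.cdf(xmin, loc=mu, scale=sig)
    return lambda t: norm.pdf(t, loc=mu, scale=sig) / Z

uniform = lambda t: np.full_like(t, 1.0 / (xmax - xmin))

var_good = per_sample_variance(truncated_normal(2.0, 0.7))  # parameters used in the post
var_unif = per_sample_variance(uniform)                     # plain Monte Carlo
var_bad = per_sample_variance(truncated_normal(0.5, 0.7))   # assumed poorly centred density

print("matched Gaussian:", var_good)
print("uniform:         ", var_unif)
print("shifted Gaussian:", var_bad)
```

The well-matched Gaussian gives a smaller per-sample variance than uniform sampling, while the shifted Gaussian, which puts little probability where the integrand is large, gives a much larger one. A badly chosen density can therefore be worse than plain Monte Carlo.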

Reference:

http://iacs-courses.seas.harvard.edu/courses/am207/blog/lecture-3.html
