Hung-yi Lee Machine Learning Course Notes_ML Lecture 1: Regression - Demo

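Introduction:

I recently started studying machine learning. I had long heard of Professor Hung-yi Lee from Taiwan but never found the time to watch his lecture series. Today I watched one lecture and thought it was excellent: easy to follow, focused on the key points, and sprinkled with fun examples that help the material stick.

Video link (bilibili): Hung-yi Lee Machine Learning (2017)

Another student has already taken careful notes and published them on GitHub: Hung-yi Lee Machine Learning Notes (LeeML-Notes)

So my notes below only record my own summaries and the questions I had while watching. If you can answer any of them, I would appreciate your advice.

To learn machine learning, let's start by running demos. The demo below is a complete reproduction of Professor Lee's demo.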

import numpy as np
import matplotlib.pyplot as plt

x_data = [338., 333., 328., 207., 226., 25., 179., 60., 208., 606.]
y_data = [640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.]
# model: y_data = b + w * x_data

x = np.arange(-200, -100, 1)  # candidate bias values
y = np.arange(-5, 5, 0.1)  # candidate weight values
# Z[j][i] holds the mean squared error at w = y[j], b = x[i],
# so its shape is (len(y), len(x)), which is what plt.contour expects.
Z = np.zeros((len(y), len(x)))
X, Y = np.meshgrid(x, y)
for i in range(len(x)):
    for j in range(len(y)):
        b = x[i]
        w = y[j]
        Z[j][i] = 0
        for n in range(len(x_data)):
            Z[j][i] = Z[j][i] + (y_data[n] - b - w*x_data[n])**2
        Z[j][i] = Z[j][i]/len(x_data)

b = -129  # initialize b
w = -4  # initialize w
lr = 0.0000001  # learning rate
iteration = 100000

# Store initial values for plotting
b_history = [b]
w_history = [w]

# Iteration
for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
        w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]

    # Update parameters
    b = b - lr * b_grad
    w = w - lr * w_grad

    # Store the parameters for plotting
    b_history.append(b)
    w_history.append(w)

# Plot the figure
plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()

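The output is:

The horizontal axis is b and the vertical axis is w; the × marks the optimum. Clearly, the run in the figure does not reach the optimum, which is still far away. So we increase the learning rate to lr = 0.000001 (10 times larger) and obtain the result in Figure 2.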
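As an aside, the position of the × at (-188.4, 2.67) can be checked without gradient descent, because a linear model with squared error has a closed-form least-squares solution. The sketch below is my own sanity check, not part of Professor Lee's demo; the names `A`, `b_opt` and `w_opt` are introduced here only for illustration.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])
y_data = np.array([640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.])

# Design matrix: a column of ones for the bias b and a column of x for the weight w.
A = np.column_stack([np.ones_like(x_data), x_data])

# Least-squares solution of A @ [b, w] ~= y_data.
(b_opt, w_opt), *_ = np.linalg.lstsq(A, y_data, rcond=None)

# Expected to land near b = -188.4, w = 2.67, the point marked in the plots.
print(f"b = {b_opt:.2f}, w = {w_opt:.2f}")
```

If the printed values agree with the marker, any gap between the black trajectory and the × comes from the optimizer, not from a misplaced target.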

# Change the lr to 0.000001
b = -129  # initialize b
w = -4  # initialize w
lr = 0.000001  # learning rate
iteration = 100000

# Store initial values for plotting
b_history = [b]
w_history = [w]

# Iteration
for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
        w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]

    # Update parameters
    b = b - lr * b_grad
    w = w - lr * w_grad

    # Store the parameters for plotting
    b_history.append(b)
    w_history.append(w)

# Plot the figure
plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()

We increase the learning rate again, to lr = 0.00001 (another 10 times larger), and obtain the result in Figure 3.

# Change the lr to 0.00001
b = -129  # initialize b
w = -4  # initialize w
lr = 0.00001  # learning rate
iteration = 100000

# Store initial values for plotting
b_history = [b]
w_history = [w]

# Iteration
for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
        w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]

    # Update parameters
    b = b - lr * b_grad
    w = w - lr * w_grad

    # Store the parameters for plotting
    b_history.append(b)
    w_history.append(w)

# Plot the figure
plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()

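We first set the learning rate to 0.0000001. After 100,000 iterations the result was still far from the optimum, which shows that the learning rate was too small. We then raised it tenfold to 0.000001. This time the updates oscillated, but the result was somewhat better and closer to the optimum. Finally we raised it another tenfold, and the trajectory left the plot entirely: the updates oscillate wildly and never find the optimum.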
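One way to see why these three learning rates behave so differently is to look at the curvature of the loss. The sketch below is my own rough analysis, not something from the lecture demo. It assumes the loss being minimized is the summed squared error used in the gradient loops, whose Hessian in (b, w) is the constant matrix 2·AᵀA, and it applies the standard result that plain gradient descent on a quadratic is stable only when lr < 2 / (largest Hessian eigenvalue). The names `A`, `hessian` and `lr_limit` are illustrative.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])

# Summed squared error L(b, w) = sum((y - b - w*x)**2) has the constant
# Hessian 2 * A.T @ A, where A has a column of ones (for b) and a column of x (for w).
A = np.column_stack([np.ones_like(x_data), x_data])
hessian = 2.0 * A.T @ A

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eig_min, eig_max = np.linalg.eigvalsh(hessian)

# Gradient descent on a quadratic is stable only for lr < 2 / eig_max.
lr_limit = 2.0 / eig_max
print(f"curvature ratio eig_max / eig_min = {eig_max / eig_min:.3g}")
print(f"largest stable learning rate      = {lr_limit:.3g}")

for lr in (1e-7, 1e-6, 1e-5):
    # Per-step multiplier of the error along the steepest direction.
    factor = 1.0 - lr * eig_max
    if abs(factor) >= 1.0:
        verdict = "diverges"
    elif factor < 0.0:
        verdict = "converges, but the sign flips every step (oscillation)"
    else:
        verdict = "converges without oscillation"
    print(f"lr = {lr:g}: factor = {factor:+.3f}, {verdict}")
```

Under these assumptions the stability limit falls between 0.000001 and 0.00001, which is consistent with the oscillation seen at the first value and the blow-up at the second. The very large curvature ratio also suggests why 0.0000001 is stable yet makes almost no progress along the flat direction, which is mostly b.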

The solution is to give b and w their own customized learning rates. This method is called AdaGrad.
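Written out for the two parameters here, the update that the following code performs can be summarized as below. The notation is my own and only for illustration: g is the gradient at step t (`b_grad`, `w_grad`), G is the running sum of squared gradients (`b_lr`, `w_lr`), and η is `lr`. Note that this demo uses no small ε term in the denominator.

```latex
\begin{aligned}
G_b^{(t)} &= \sum_{i=0}^{t} \bigl(g_b^{(i)}\bigr)^2, &
b^{(t+1)} &= b^{(t)} - \frac{\eta}{\sqrt{G_b^{(t)}}}\, g_b^{(t)}, \\
G_w^{(t)} &= \sum_{i=0}^{t} \bigl(g_w^{(i)}\bigr)^2, &
w^{(t+1)} &= w^{(t)} - \frac{\eta}{\sqrt{G_w^{(t)}}}\, g_w^{(t)}.
\end{aligned}
```

Because each parameter is divided by the size of its own gradient history, the first step moves both b and w by exactly η, regardless of how different their raw gradients are.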

# Using AdaGrad to solve this problem
b = -129  # initialize b
w = -4  # initialize w
lr = 1  # learning rate
iteration = 100000

# Running sums of squared gradients, one per parameter
b_lr = 0.0
w_lr = 0.0

# Store initial values for plotting
b_history = [b]
w_history = [w]

# Iteration
for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
        w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]

    b_lr = b_lr + b_grad**2
    w_lr = w_lr + w_grad**2

    # Update parameters
    b = b - lr/np.sqrt(b_lr) * b_grad
    w = w - lr/np.sqrt(w_lr) * w_grad

    # Store the parameters for plotting
    b_history.append(b)
    w_history.append(w)

# Plot the figure
plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$b$', fontsize=16)
plt.ylabel(r'$w$', fontsize=16)
plt.show()

The final result is shown in Figure 4:
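Without the figure at hand, the end point of the AdaGrad run can also be checked numerically. The sketch below repeats the same update as the loop above, with the inner Python loop replaced by NumPy array operations and the plotting left out. The function name `adagrad_fit` and its parameters are hypothetical helpers of mine, not part of the original demo.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])
y_data = np.array([640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.])


def adagrad_fit(x, y, b=-129.0, w=-4.0, lr=1.0, iterations=100000):
    """Fit y ~= b + w * x using the same AdaGrad update as the demo loop."""
    b_sq_sum = 0.0  # running sum of squared b gradients (b_lr in the demo)
    w_sq_sum = 0.0  # running sum of squared w gradients (w_lr in the demo)
    for _ in range(iterations):
        residual = y - b - w * x
        b_grad = -2.0 * residual.sum()
        w_grad = -2.0 * (residual * x).sum()
        b_sq_sum += b_grad ** 2
        w_sq_sum += w_grad ** 2
        # Each parameter is scaled by its own gradient history.
        b -= lr / np.sqrt(b_sq_sum) * b_grad
        w -= lr / np.sqrt(w_sq_sum) * w_grad
    return b, w


b_final, w_final = adagrad_fit(x_data, y_data)
print(f"b = {b_final:.2f}, w = {w_final:.2f}")
```

The printed values can be compared with the × at (-188.4, 2.67) from the plots. Like the original code, this sketch would divide by zero if a gradient were exactly zero on the first step, so a small constant in the denominator would be a reasonable safeguard in general use.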
