The components and implementation of a basic gradient descent in Python
In my impression, gradient descent is for finding the value of the independent variable that gives the minimum (or maximum) of an objective function. So we need:
- an objective function \(\mathcal{L}\)
- the gradient of \(\mathcal{L}\) (for the example below, \(2x+2\))
- \(\Delta x\), the amount by which the independent variable is updated: \(x \leftarrow x + \Delta x\) (how these pieces fit together is written out right after this list)
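The three components combine into the standard descent update, where \(r\) is the step size discussed later in the post and the minus sign means we move against the gradient in order to decrease \(\mathcal{L}\):

\[
\Delta x = -\,r\,\nabla\mathcal{L}(x), \qquad x \leftarrow x - r\,\nabla\mathcal{L}(x).
\]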
1. Here \(\mathcal{L}\) is a concrete function: \(f(x)=x^2+2x+1\).
How do we find the \(x_0\) at which \(f(x)\) takes its minimum value, via gradient descent?
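For this particular \(f\) the minimizer can also be found analytically, which gives a value to check the gradient-descent result against:

\[
f(x) = x^2 + 2x + 1 = (x+1)^2, \qquad f'(x) = 2x + 2 = 0 \;\Rightarrow\; x_0 = -1,\quad f(x_0) = 0.
\]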
Start with an arbitrary \(x\) and calculate the value of \(f(x)\):
import random

def func(x):
    return x*x + 2*x + 1

def gred(x):  # the gradient of f(x)
    return 2*x + 2

x = random.uniform(-10.0, 10.0)  # randomly pick a float in the interval (-10, 10)
# x = 10
print('x starts at:', x)
y0 = func(x)   # first evaluation of f(x)
delta = 0.5    # the value of delta_x for the first step
x = x + delta

# === iteration ===
for i in range(100):
    print('i=', i)
    y1 = func(x)
    delta = -0.08*gred(x)
    print(' delta=', delta)
    if y1 > y0:
        print(' y1>y0')   # the last step increased f(x)
    else:
        print(' y1<=y0')  # the last step decreased f(x) (or left it unchanged)
    # either way, move against the gradient:
    # if gred(x) is positive, x should decrease;
    # if gred(x) is negative, x should increase.
    x = x + delta
    y0 = y1
    print(' x=', x, 'f(x)=', y1)
Let's discuss how to determine the update amount delta in the code above.
If \(y_1 - y_0\) is a large positive number, i.e. \(y_1 \gg y_0\), then \(x\) should shift backward heavily, so the update can be taken proportional to the negative gradient, something like \((y_1-y_0)\times(-\text{gradient})\). In the code it is simply \(\Delta x = -r \times \text{gred}(x)\), where \(r = 0.08\) is the step size.
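Putting this together, the whole loop can be written more compactly as the standard update \(x \leftarrow x - r\,f'(x)\). A minimal sketch of the same idea, without the diagnostic prints:

```python
def func(x):
    return x*x + 2*x + 1

def gred(x):              # analytic gradient of func
    return 2*x + 2

x = 10.0                  # any starting point works for this convex function
r = 0.08                  # step size

for _ in range(100):
    x = x - r * gred(x)   # always move against the gradient

print('x =', x, 'f(x) =', func(x))   # x approaches -1, f(x) approaches 0
```

For this quadratic the update is \(x \leftarrow (1-2r)x - 2r\), so it converges to \(-1\) whenever \(0 < r < 1\); a step size outside that range makes the iterates oscillate or diverge.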
Basic gradient descent has many shortcomings, which you can find by searching for 'shortcomings of gradient descent'.
Another problem with the GD algorithm: what if \(\mathcal{L}\) has no explicit expression for its gradient?
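One common workaround (not part of the code above) is to approximate the gradient numerically with finite differences. A minimal sketch, assuming a central-difference estimate with a small step h:

```python
def func(x):
    return x*x + 2*x + 1

def numerical_grad(f, x, h=1e-6):
    # central-difference approximation of f'(x); no analytic gradient required
    return (f(x + h) - f(x - h)) / (2 * h)

x = 5.0
r = 0.08
for _ in range(100):
    x = x - r * numerical_grad(func, x)

print('x =', x)   # again approaches -1, without ever writing down f'(x)
```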
Stochastic Gradient Descent (SGD) is another variant of the GD algorithm.
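The idea is that each update uses the gradient of the loss on a single randomly chosen sample (or a small mini-batch) instead of the full objective. A minimal sketch on a toy least-squares problem of fitting \(y = wx + b\) (the data, step size, and iteration count below are illustrative assumptions, not from the original post):

```python
import random

# toy data generated from y = 3x + 2 (assumed example)
data = [(0.1 * i, 3.0 * (0.1 * i) + 2.0) for i in range(-50, 51)]

w, b = 0.0, 0.0   # parameters to learn
r = 0.01          # step size

for step in range(5000):
    x, y = random.choice(data)   # one random sample per update: the "stochastic" part
    err = (w * x + b) - y        # prediction error on this sample
    # gradients of the per-sample squared error 0.5 * err**2
    grad_w = err * x
    grad_b = err
    w -= r * grad_w
    b -= r * grad_b

print('w ~', w, 'b ~', b)        # should end up near w = 3, b = 2
```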