Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

2019-07-15 22:23:02

Paper: https://arxiv.org/pdf/1801.01290.pdf or Updated Version: https://arxiv.org/pdf/1812.05905.pdf

Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/

TensorFlow: https://github.com/haarnoja/sac

PyTorch: https://github.com/vitchyr/rlkit

Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA

Good Related Blog: https://zhuanlan.zhihu.com/p/70360272

==== Video Related Tutorials (A2C, A3C):

A brief review of Actor-Critic Algorithms: 　　https://www.youtube.com/watch?v=aODdNpihRwM

CS885 Lecture 7b: Actor Critic: 　　　　　　 https://www.youtube.com/watch?v=5Ke-d1Itk3k

DRL Lecture 6: Actor-Critic: 　　　　　　　 https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s

Build an A2C agent that learns to play Sonic with Tensorflow (tutorial): 　　https://www.youtube.com/watch?v=GCfUdkCL7FQ

Reinforcement Learning 6: Policy Gradients and Actor Critics (Deep Mind): 　　 https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s

Actor Critic (A3C) Tutorial: 　　　　　　　　https://www.youtube.com/watch?v=O5BlozCJBSE

Actor Critic Algorithms: 　　　　　　　　　 https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor的更多相关文章

18 Issues in Current Deep Reinforcement Learning from ZhiHu
深度强化学习的18个关键问题 from: https://zhuanlan.zhihu.com/p/32153603 85 人赞了该文章深度强化学习的问题在哪里?未来怎么走?哪些方面可以突破? 这两 ...
(zhuan) Deep Reinforcement Learning Papers
Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...
(转) Deep Reinforcement Learning: Pong from Pixels
Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...
论文笔记之：Asynchronous Methods for Deep Reinforcement Learning
Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...
深度强化学习：入门(Deep Reinforcement Learning: Scratching the surface)
RL的方案两个主要对象:Agent和Environment Agent观察Environment,做出Action,这个Action会对Environment造成一定影响和改变,继而Agent会从新 ...
深度强化学习（Deep Reinforcement Learning）入门：RL base & DQN-DDPG-A3C introduction
转自https://zhuanlan.zhihu.com/p/25239682 过去的一段时间在深度强化学习领域投入了不少精力,工作中也在应用DRL解决业务问题.子曰:温故而知新,在进一步深入研究和应 ...
(转) Deep Reinforcement Learning: Playing a Racing Game
Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...
Deep Reinforcement Learning with Iterative Shift for Visual Tracking
Deep Reinforcement Learning with Iterative Shift for Visual Tracking 2019-07-30 14:55:31 Paper: http ...
论文笔记之：Dueling Network Architectures for Deep Reinforcement Learning
Dueling Network Architectures for Deep Reinforcement Learning ICML 2016 Best Paper 摘要:本文的贡献点主要是在 DQN ...

随机推荐

android之自定义viewGroup仿scrollView的两种实现（滚动跟粘性）
最近一直在研究自定义控件,一般大致分为三种情况:自绘控件,组合控件,继承控件.接下来我们来看下继承控件.在此借鉴一位博主的文章,补充粘性的实现效果,并且加注自己的一些理解.大家也可以查看原文博客.an ...
linux查看log软件
可以使用LNAV软件查看log,还是比较方便的安装步骤 $ sudo apt install lnav 获取帮助信息 $ lnav -h 查看日志 $ lnav 查看指定日志(后面加上绝对路径) $ ...
折叠面板实现，上传文件进度条，三级联选择器，多级联选择器，利用layui实现
首先贴出html代码 <form class="layui-form" action=""> <div class="layui-f ...
JWT对SpringCloud进行系统认证和服务鉴权
JWT对SpringCloud进行系统认证和服务鉴权一.为什么要使用jwt?在微服务架构下的服务基本都是无状态的,传统的使用session的方式不再适用,如果使用的话需要做同步session机制,所 ...
【监控】jvisualvm之jmx远程连接 jar启动应用
一.Java -jar启动添加如下参数就可以了 -Dcom.sun.management.jmxremote -Djava.rmi.server.hostname=127.0.0.1 -Dcom.su ...
SHELL脚本编程-普通数组(列表)和关联数组(字典)
SHELL脚本编程-普通数组(列表)和关联数组(字典) 作者:尹正杰版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.数组相关概述变量: 存储单个元素的内存空间数组: 存储多个元素的连续的 ...
Flume实战案例运维篇
Flume实战案例运维篇作者:尹正杰版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.Flume概述 1>.什么是Flume Flume是一个分布式.可靠.高可用的海量日志聚合系统,支 ...
C#锁对象代码
private static readonly object SequenceLock = new object(); private static readonly object SequenceL ...
Linux入侵类问题排查思路
深入分析,查找入侵原因一.检查隐藏帐户及弱口令检查服务器系统及应用帐户是否存在弱口令: 检查说明:检查管理员帐户.数据库帐户.MySQL 帐户.tomcat 帐户.网站后台管理员帐户等密码设置是 ...
C# CefSharp如何在Winforms应用程序中使用
最近做了一个很小的功能,在网页上面打开应用程序,用vs的debug调试,可以正常打开应用程序,可布置到iis上面却无法运行应用程序,吾百度之,说是iis权限问题,吾依理做之,可怎么折腾也不行.最后bo ...

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor的更多相关文章

随机推荐

热门专题