Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
2019-07-15 22:23:02
Paper: https://arxiv.org/pdf/1801.01290.pdf or Updated Version: https://arxiv.org/pdf/1812.05905.pdf
Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/
TensorFlow: https://github.com/haarnoja/sac
PyTorch: https://github.com/vitchyr/rlkit
Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA
Good Related Blog: https://zhuanlan.zhihu.com/p/70360272
==== Video Related Tutorials (A2C, A3C):
A brief review of Actor-Critic Algorithms: https://www.youtube.com/watch?v=aODdNpihRwM
CS885 Lecture 7b: Actor Critic: https://www.youtube.com/watch?v=5Ke-d1Itk3k
DRL Lecture 6: Actor-Critic: https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s
Build an A2C agent that learns to play Sonic with Tensorflow (tutorial): https://www.youtube.com/watch?v=GCfUdkCL7FQ
Reinforcement Learning 6: Policy Gradients and Actor Critics (Deep Mind): https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s
Actor Critic (A3C) Tutorial: https://www.youtube.com/watch?v=O5BlozCJBSE
Actor Critic Algorithms: https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s
==
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor的更多相关文章
- 18 Issues in Current Deep Reinforcement Learning from ZhiHu
深度强化学习的18个关键问题 from: https://zhuanlan.zhihu.com/p/32153603 85 人赞了该文章 深度强化学习的问题在哪里?未来怎么走?哪些方面可以突破? 这两 ...
- (zhuan) Deep Reinforcement Learning Papers
Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...
- (转) Deep Reinforcement Learning: Pong from Pixels
Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...
- 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning
Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...
- 深度强化学习:入门(Deep Reinforcement Learning: Scratching the surface)
RL的方案 两个主要对象:Agent和Environment Agent观察Environment,做出Action,这个Action会对Environment造成一定影响和改变,继而Agent会从新 ...
- 深度强化学习(Deep Reinforcement Learning)入门:RL base & DQN-DDPG-A3C introduction
转自https://zhuanlan.zhihu.com/p/25239682 过去的一段时间在深度强化学习领域投入了不少精力,工作中也在应用DRL解决业务问题.子曰:温故而知新,在进一步深入研究和应 ...
- (转) Deep Reinforcement Learning: Playing a Racing Game
Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...
- Deep Reinforcement Learning with Iterative Shift for Visual Tracking
Deep Reinforcement Learning with Iterative Shift for Visual Tracking 2019-07-30 14:55:31 Paper: http ...
- 论文笔记之:Dueling Network Architectures for Deep Reinforcement Learning
Dueling Network Architectures for Deep Reinforcement Learning ICML 2016 Best Paper 摘要:本文的贡献点主要是在 DQN ...
随机推荐
- js 判断浏览器是pc端还是移动端
if(/Android|webOS|iPhone|iPod|BlackBerry/i.test(navigator.userAgent)) { //说明是移动端 } else { //说明是pc端 }
- LXC容器
1. LXC简述 Linux container是一种资源隔离机制而非虚拟化技术.VMM(VMM Virtual Machine Monitor)或者叫Hypervisor是标准的虚拟化技术,这 ...
- vim 配置遇到的问题
1 使用 Vundle 安装插件时提示输入 github 账户密码 .vimrc 中 Plugin ‘路径' 的路径填写错误,仔细检查下 2 在 vim 中执行 shell 命令(如 ls)会闪退 . ...
- 华为OSPF与ACL综合应用实例
实验要求1.企业内网运行OSPF路由协议,区域规划如图所示:2.财务和研发所在的区域不受其他区域链路不稳定性影响:3.R1.R2.R3只允许被IT登录管理:4.YF和CW之间不能互通,但都可以与IT互 ...
- Pthon魔术方法(Magic Methods)-容器相关方法
Pthon魔术方法(Magic Methods)-容器相关方法 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.容器相关方法汇总 __len__: 内建函数len(),返回对象的 ...
- 利用Git钩子实现代码发布
目录 1.什么是git钩子 2.安装一个钩子 3.常用的钩子脚本类型 3.1 客户端钩子 3.1.1 pre-commit 3.1.2 prepare-commit-msg 3.1.3 commit- ...
- SQL模糊查询的四种匹配模式
执行数据库查询时,有完整查询和模糊查询之分,一般模糊语句如下: SELECT 字段 FROM 表 WHERE 某字段 Like 条件 一.四种匹配模式 关于条件,SQL提供了四种匹配模式: 1.% 表 ...
- [转]kafka要等一段时间才能消费到数据
kafka要等一段时间才能消费到数据 pythonkafka 为什么用python写的kafka客户端脚本,程序一运行就能生产数据,而要等一段时间才能消费到数据(topic里面有数据).(pyk ...
- [POJ2083] Fracal
Description A fractal is an object or quantity that displays self-similarity, in a somewhat technica ...
- delete properties inside object