https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume4/kaelbling96a-html/node26.html

【平均-打折奖励】

Schwartz [106] examined the problem of adapting Q-learning to an average-reward framework. Although his R-learning algorithm seems to exhibit convergence problems for some MDPs, several researchers have found the average-reward criterion closer to the true problem they wish to solve than a discounted criterion and therefore prefer R-learning to Q-learning [69].

To discount or not to discount in reinforcement learning: A case study comparing R learning and Q learning的更多相关文章

  1. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  2. (转) Using the latest advancements in AI to predict stock market movements

    Using the latest advancements in AI to predict stock market movements 2019-01-13 21:31:18 This blog ...

  3. 强化学习(Reinfment Learning) 简介

    本文内容来自以下两个链接: https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/ https: ...

  4. (zhuan) Deep Reinforcement Learning Papers

    Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...

  5. (转) Deep Learning in a Nutshell: Reinforcement Learning

    Deep Learning in a Nutshell: Reinforcement Learning   Share: Posted on September 8, 2016by Tim Dettm ...

  6. (转) Deep Reinforcement Learning: Pong from Pixels

    Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...

  7. Deep Reinforcement Learning: Pong from Pixels

    这是一篇迟来很久的关于增强学习(Reinforcement Learning, RL)博文.增强学习最近非常火!你一定有所了解,现在的计算机能不但能够被全自动地训练去玩儿ATARI(译注:一种游戏机) ...

  8. [DQN] What is Deep Reinforcement Learning

    已经成为DL中专门的一派,高大上的样子 Intro: MIT 6.S191 Lecture 6: Deep Reinforcement Learning Course: CS 294: Deep Re ...

  9. Deep Reinforcement Learning 基础知识

    Introduction 深度增强学习Deep Reinforcement Learning是将深度学习与增强学习结合起来从而实现从Perception感知到Action动作的端对端学习的一种全新的算 ...

随机推荐

  1. 页面css代码

    博主原来的页面css代码 (这个是原来的那种效果,差不多弄出来会是这种效果http://www.cnblogs.com/thmyl/) /*simplememory*/ #google_ad_c1, ...

  2. JS版汉字与拼音互转终极方案,附简单的JS拼音

    前言 网上关于JS实现汉字和拼音互转的文章很多,但是比较杂乱,都是互相抄来抄去,而且有的不支持多音字,有的不支持声调,有的字典文件太大,还比如有时候我仅仅是需要获取汉字拼音首字母却要引入200kb的字 ...

  3. 记录下我的阿里云centos服务器之路

    以下内容都已经过试验,边走边记,懒得排版 安装aphach yum install -y httpd systemctl start httpd netstat -tulp    安装桌面 尽量不用桌 ...

  4. sublime去除空白行和重复行

    去除空白行 edit -> line -> delete blank lines 去除重复行 打开正则模式 1 edit-> sort lines 2 command+option+ ...

  5. linux命令lsattr、chattr、man

    1.man命令,可以查看手册 配置位置/etc/man.conf MANPATH决定手册查询位置 MANSECT决定man查询的顺序 man的查询 linux man的常用用法: man sectio ...

  6. linux下crontab使用笔记

    1. 安装     service crond status     yum install vixie-cron     yum install crontabs 2. 实例         每分钟 ...

  7. java Excel导入、自适应版本、将Excel转成List<map>对象

    转载:http://blog.csdn.net/u012662357/article/details/58593020 最近在web开发中遇到excel批量导入,在网上搜了下很少有将excel直接转成 ...

  8. HDU 4355 Party All the Time(三分|二分)

    题意:n个人,都要去參加活动,每一个人都有所在位置xi和Wi,每一个人没走S km,就会产生S^3*Wi的"不舒适度",求在何位置举办活动才干使全部人的"不舒适度&quo ...

  9. Cloudera

    官方文档: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/ 博客教程 http://www.wangyon ...

  10. typedef 与 define 的区别

    1.区别 (1)定义.执行时间.作用域 定义.执行时间: #define pchar char * typedef char *pchar; 定义的格式差别,显而易见的,要注意,define 是不能存 ...